Many genomic analyses start by aligning sequencing reads to a linear reference genome. However, linear reference genomes are imperfect: they lack millions of bases of unknown relevance and cannot reflect the genetic diversity of populations. This makes reference-guided methods susceptible to reference-allele bias. To overcome such limitations, we build a pangenome from six reference-quality assemblies from taurine and indicine cattle as well as yak. The pangenome contains an additional 70,329,827 bases compared to the reference genome. Our multiassembly approach reveals 30 and 10.1 million bases private to yak and indicine cattle, respectively, and between 3.3 and 4.4 million bases unique to each taurine assembly. Utilizing transcriptomes from 56 cattle, we show that these nonreference sequences encode transcripts that hitherto remained undetected from the reference genome. We uncover genes, primarily encoding proteins contributing to immune response and pathogen-mediated immunomodulation, differentially expressed between -infected and noninfected cattle that are also undetectable in the reference genome. Using whole-genome sequencing data of cattle from five breeds, we show that reads that were previously misaligned against the reference genome now align accurately to the pangenome sequences. This enables us to discover 83,250 polymorphic sites that segregate within and between breeds of cattle and capture genetic differentiation across breeds. Our work makes a so-far unused source of variation amenable to genetic investigations and provides methods and a framework for establishing and exploiting a more diverse reference genome.
| Download full-text PDF | Source |
|---|---|
| http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8157972 | PMC |
| http://dx.doi.org/10.1073/pnas.2101056118 | DOI Listing |
Clin Epigenetics
January 2025
Centre for Genomic and Experimental Medicine, Institute of Genetics and Cancer, University of Edinburgh, Edinburgh, UK.
Alcohol consumption is an important risk factor for multiple diseases. It is typically assessed via self-report, which is open to measurement error through recall bias. Instead, molecular data such as blood-based DNA methylation (DNAm) could be used to derive a more objective measure of alcohol consumption by incorporating information from cytosine-phosphate-guanine (CpG) sites known to be linked to the trait.
Nat Commun
January 2025
Gene Regulation Laboratory, MRC Weatherall Institute of Molecular Medicine, John Radcliffe Hospital, OX3 9DS, Oxford, UK.
Individual enhancers are defined as short genomic regulatory elements, bound by transcription factors, and able to activate cell-specific gene expression at a distance, in an orientation-independent manner. Within mammalian genomes, enhancer-like elements may be found individually or within clusters referred to as locus control regions or super-enhancers (SEs). While these behave similarly to individual enhancers with respect to cell specificity, distribution and distance, their orientation-dependence has not been formally tested.
Mod Pathol
January 2025
Department of Pathology and Medical Biology, University Medical Center Groningen, Groningen, the Netherlands; Department of Pathology, Amsterdam University Medical Center, Amsterdam, the Netherlands.
Fibro-osseous tumors of the craniofacial bones are a heterogeneous group of lesions comprising cemento-osseous dysplasia (COD), cemento-ossifying fibroma (COF), juvenile trabecular ossifying fibroma (JTOF), psammomatoid ossifying fibroma (PsOF), fibrous dysplasia (FD), and low-grade osteosarcoma (LGOS) with overlapping clinicopathological features. However, their clinical behavior and treatment differ significantly, underlining the need for accurate diagnosis. Molecular diagnostic markers exist for subsets of these tumors, including GNAS mutations in FD, SATB2 fusions in PsOF, mutations involving the RAS-MAPK signaling pathway in COD, and MDM2 amplification in LGOS.
Diagn Microbiol Infect Dis
January 2025
National Reference Laboratory of Control and Monitoring of Antibiotic Resistance (NRL-CMAR), Department Microbiology, National Center of Infectious and Parasitic Diseases (NCIPD), 26 Yanko Sakazov Blvd., Sofia, Bulgaria.
An increased incidence of Clostridioides difficile infections was documented in Bulgarian hospitals during COVID-19. WGS was performed on 39 isolates from seven hospitals during 2015-2022. Antimicrobial resistance and toxin genes were inferred from genomes.
Sci China Life Sci
January 2025
Institute of Rare Diseases, West China Hospital of Sichuan University, Sichuan University, Chengdu, 610000, China.
Genomic sources from China are underrepresented in the population-specific reference database. We performed whole-genome sequencing or genome-wide genotyping on 1,207 individuals from four linguistically diverse groups (1,081 Sinitic, 56 Mongolic, 40 Turkic, and 30 Tibeto-Burman people) living in North China included in the 10K Chinese People Genomic Diversity Project (10K_CPGDP) to characterize the genetic architecture and adaptive history of ethnic groups in the Silk Road Region of China. We observed a population split between Northwest Chinese minorities (NWCMs) and Han Chinese since the Upper Paleolithic and later Neolithic genetic differentiation within NWCMs.