Despite only 8% of cattle being found in Europe, European breeds dominate current genetic resources. This adversely impacts cattle research in other important global cattle breeds, especially those from Africa for which genomic resources are particularly limited, despite their disproportionate importance to the continent's economies. To mitigate this issue, we have generated assemblies of African breeds, which have been integrated with genomic data for 294 diverse cattle into a graph genome that incorporates global cattle diversity. We illustrate how this more representative reference assembly contains an extra 116.1 Mb (4.2%) of sequence absent from the current Hereford sequence and consequently inaccessible to current studies. We further demonstrate how using this graph genome increases read mapping rates, reduces allelic biases and improves the agreement of structural variant calling with independent optical mapping data. Consequently, we present an improved, more representative, reference assembly that will improve global cattle research.
Download full-text PDF |
Source |
---|---|
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8854726 | PMC |
http://dx.doi.org/10.1038/s41467-022-28605-0 | DOI Listing |
Brief Bioinform
November 2024
Center for Genomics and Biotechnology, Fujian Provincial Key Laboratory of Haixia Applied Plant Systems Biology, Haixia Institute of Science and Technology, Fujian Agriculture and Forestry University, No. 15 Shangxiadian Road, Cangshan District, Fuzhou 350002, China.
Spatial transcriptomics (ST) technologies enable dissecting the tissue architecture in spatial context. To perceive the global contextual information of gene expression patterns in tissue, the spatial dependence of cells must be fully considered by integrating both local and non-local features by means of spatial-context-aware. However, the current ST integration algorithm ignores for ST dropouts, which impedes the spatial-aware of ST features, resulting in challenges in the accuracy and robustness of microenvironmental heterogeneity detecting, spatial domain clustering, and batch-effects correction.
View Article and Find Full Text PDFLebniz Int Proc Inform
August 2024
Huck Institutes of the Life Sciences, The Pennsylvania State University, University Park, PA, USA Department of Computer Science and Engineering, The Pennsylvania State University, University Park, PA, USA.
Modern sequencing technologies allow for the addition of short-sequence tags, known as anchors, to both ends of a captured molecule. Anchors are useful in assembling the full-length sequence of a captured molecule as they can be used to accurately determine the endpoints. One representative of such anchor-enabled technology is LoopSeq Solo, a synthetic long read (SLR) sequencing protocol.
View Article and Find Full Text PDFiScience
December 2024
University of Helsinki, Helsinki, Finland.
Brief Bioinform
November 2024
Ruhr University Bochum, Medical Faculty, Core Unit Bioinformatics - CUBiMed.RUB, Universitätsstr. 105, 44789 Bochum, Germany.
Due to computational resource limitations, in mass spectrometry based proteomics only a limited set of peptide sequences is used for the matching against measured spectra. We present an approach to represent proteins by graphs and allow not only the canonical sequences but also known isoforms and annotated amino acid variations, e.g.
View Article and Find Full Text PDFPlant Biotechnol J
January 2025
Bioinformatics Group, Wageningen University & Research, Wageningen, The Netherlands.
With advances in long-read sequencing and assembly techniques, haplotype-resolved (phased) genome assemblies are becoming more common, also in the field of plant genomics. Computational tools to effectively explore these phased genomes, particularly for polyploid genomes, are currently limited. Here we describe a new strategy adopting a pangenome approach.
View Article and Find Full Text PDFEnter search terms and have AI summaries delivered each week - change queries or unsubscribe any time!