A cattle graph genome incorporating global breed diversity.

Nat Commun

The Roslin Institute, Royal (Dick) School of Veterinary Studies, University of Edinburgh, Easter Bush Campus, Midlothian, EH25 9RG, UK.

Published: February 2022

Despite only 8% of cattle being found in Europe, European breeds dominate current genetic resources. This adversely impacts cattle research in other important global cattle breeds, especially those from Africa for which genomic resources are particularly limited, despite their disproportionate importance to the continent's economies. To mitigate this issue, we have generated assemblies of African breeds, which have been integrated with genomic data for 294 diverse cattle into a graph genome that incorporates global cattle diversity. We illustrate how this more representative reference assembly contains an extra 116.1 Mb (4.2%) of sequence absent from the current Hereford sequence and consequently inaccessible to current studies. We further demonstrate how using this graph genome increases read mapping rates, reduces allelic biases and improves the agreement of structural variant calling with independent optical mapping data. Consequently, we present an improved, more representative, reference assembly that will improve global cattle research.
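
As a rough sanity check on the figures above, the following Python sketch back-calculates the baseline size implied by 116.1 Mb being 4.2% of it. The abstract does not state which total the percentage is taken against, so treating it as the size of the current Hereford reference is an assumption, and the function name `implied_baseline_mb` is hypothetical.

```python
def implied_baseline_mb(extra_mb: float, extra_fraction: float) -> float:
    """Return the baseline size (Mb) for which extra_mb equals extra_fraction of it."""
    if not 0 < extra_fraction <= 1:
        raise ValueError("extra_fraction must be in (0, 1]")
    return extra_mb / extra_fraction


# Figures quoted in the abstract: 116.1 Mb of additional sequence, reported as 4.2%.
EXTRA_MB = 116.1
EXTRA_FRACTION = 0.042  # assumption: the percentage is relative to the baseline reference

baseline_mb = implied_baseline_mb(EXTRA_MB, EXTRA_FRACTION)
print(f"Implied baseline: about {baseline_mb / 1000:.2f} Gb")
```

Under that assumption the implied baseline is about 2.76 Gb. Because the 4.2% figure is itself rounded, this is only an approximate consistency check, not a statement of the actual assembly size.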

PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8854726
DOI: http://dx.doi.org/10.1038/s41467-022-28605-0

Similar Publications

STMGraph: spatial-context-aware of transcriptomes via a dual-remasked dynamic graph attention model.

Brief Bioinform

November 2024

Center for Genomics and Biotechnology, Fujian Provincial Key Laboratory of Haixia Applied Plant Systems Biology, Haixia Institute of Science and Technology, Fujian Agriculture and Forestry University, No. 15 Shangxiadian Road, Cangshan District, Fuzhou 350002, China.

Lixian Lin Haoyu Wang Yuxiao Chen Yuanyuan Wang Yujie Xu

Spatial transcriptomics (ST) technologies make it possible to dissect tissue architecture in its spatial context. To capture the global context of gene expression patterns in tissue, the spatial dependence of cells must be fully considered by integrating both local and non-local features in a spatial-context-aware manner. However, current ST integration algorithms do not account for ST dropouts, which impedes spatially aware modeling of ST features and limits the accuracy and robustness of microenvironmental heterogeneity detection, spatial domain clustering, and batch-effect correction.

Anchorage Accurately Assembles Anchor-Flanked Synthetic Long Reads.

Leibniz Int Proc Inform

August 2024

Huck Institutes of the Life Sciences, The Pennsylvania State University, University Park, PA, USA Department of Computer Science and Engineering, The Pennsylvania State University, University Park, PA, USA.

Xiaofei Carl Zang Xiang Li Kyle Metcalfe Tuval Ben-Yehezkel Ryan Kelley

Modern sequencing technologies allow for the addition of short-sequence tags, known as anchors, to both ends of a captured molecule. Anchors are useful in assembling the full-length sequence of a captured molecule as they can be used to accurately determine the endpoints. One representative of such anchor-enabled technology is LoopSeq Solo, a synthetic long read (SLR) sequencing protocol.

Flowtigs: Safety in flow decompositions for assembly graphs.

iScience

December 2024

University of Helsinki, Helsinki, Finland.

Francisco Sena Eliel Ingervo Shahbaz Khan Andrey Prjibelski Sebastian Schmidt

Article Synopsis

A network flow is represented by a collection of weighted walks that combine to create the overall flow; this article characterizes the specific walks involved in these flow decompositions.
The authors introduce a new algorithm that can efficiently identify and structure all maximal flowtigs, which are key components of flow decompositions in a network.
The practical application is metagenomic assembly, where using flowtigs improves the continuity of assembly results compared to traditional methods on both simulated and real data.

ProtGraph: a tool for the quick and comprehensive exploration and exploitation of the peptide search space derived from protein sequence databases using graphs.

Brief Bioinform

November 2024

Ruhr University Bochum, Medical Faculty, Core Unit Bioinformatics - CUBiMed.RUB, Universitätsstr. 105, 44789 Bochum, Germany.

Dominik Lux Katrin Marcus-Alic Martin Eisenacher Julian Uszkoreit

Due to computational resource limitations, in mass spectrometry based proteomics only a limited set of peptide sequences is used for the matching against measured spectra. We present an approach to represent proteins by graphs and allow not only the canonical sequences but also known isoforms and annotated amino acid variations, e.g.

Exploring intra- and intergenomic variation in haplotype-resolved pangenomes.

Plant Biotechnol J

January 2025

Bioinformatics Group, Wageningen University & Research, Wageningen, The Netherlands.

Eef M Jonkheer Dick de Ridder Theo A J van der Lee Jorn R de Haan Lidija Berke

With advances in long-read sequencing and assembly techniques, haplotype-resolved (phased) genome assemblies are becoming more common, also in the field of plant genomics. Computational tools to effectively explore these phased genomes, particularly for polyploid genomes, are currently limited. Here we describe a new strategy adopting a pangenome approach.
