Complete datasets of genetic variants are key to biodiversity genomic studies. Long-read sequencing technologies allow the routine assembly of highly contiguous, haplotype-resolved reference genomes. However, even when complete, reference genomes from a single individual may bias downstream analyses and fail to adequately represent genetic diversity within a population or species. Pangenome graphs assembled from aligned collections of high-quality genomes can overcome representation bias by integrating sequence information from multiple genomes from the same population, species or genus into a single reference. Here, we review the available tools and data structures to build, visualize and manipulate pangenome graphs while providing practical examples and discussing their applications in biodiversity and conservation genomics across the tree of life.
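
As a rough illustration of how a pangenome graph can integrate several genomes into a single reference, the sketch below models a toy sequence graph in Python. It is not taken from the review or from any particular tool: the class `SequenceGraph`, its methods, and the sample names are hypothetical, and the layout (segments, links, and per-genome paths) is only loosely inspired by common graph formats.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class SequenceGraph:
    """Toy pangenome graph (hypothetical): nodes hold sequence, paths record each genome's walk."""

    nodes: dict[int, str] = field(default_factory=dict)        # node id -> DNA segment
    edges: set[tuple[int, int]] = field(default_factory=set)   # directed links between segments
    paths: dict[str, list[int]] = field(default_factory=dict)  # genome name -> ordered node ids

    def add_path(self, name: str, node_ids: list[int]) -> None:
        """Register a genome as a walk through the graph and add the edges it traverses."""
        for node_id in node_ids:
            if node_id not in self.nodes:
                raise KeyError(f"unknown node {node_id}")
        self.paths[name] = list(node_ids)
        self.edges.update(zip(node_ids, node_ids[1:]))

    def sequence(self, name: str) -> str:
        """Spell out the genome encoded by a path."""
        return "".join(self.nodes[n] for n in self.paths[name])

    def shared_nodes(self) -> set[int]:
        """Return the nodes visited by every path (sequence common to all genomes)."""
        visited = [set(p) for p in self.paths.values()]
        return set.intersection(*visited) if visited else set()


# Two assumed genomes that differ by a single-base variant (G vs. T).
graph = SequenceGraph(nodes={1: "ACGT", 2: "G", 3: "T", 4: "CCA"})
graph.add_path("sample_A", [1, 2, 4])
graph.add_path("sample_B", [1, 3, 4])
```

In this sketch the variant appears as a "bubble" (nodes 2 and 3) flanked by nodes shared by both genomes, so neither genome is privileged as the reference; calling `graph.sequence("sample_A")` spells that genome back out from its path.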

Source: http://dx.doi.org/10.1038/s41588-024-02029-6

Similar Publications

With advances in long-read sequencing and assembly techniques, haplotype-resolved (phased) genome assemblies are becoming more common, including in plant genomics. Computational tools for effectively exploring these phased genomes, particularly polyploid ones, are currently limited. Here we describe a new strategy that adopts a pangenome approach.

Fast exact gap-affine partial order alignment with POASTA.

Bioinformatics

January 2025

Infectious Disease and Microbiome Program, Broad Institute of MIT and Harvard, Cambridge, MA 02142, United States.

Motivation: Partial order alignment is a widely used method for computing multiple sequence alignments, with applications in genome assembly and pangenomics, among many others. Current algorithms to compute the optimal, gap-affine partial order alignment do not scale well to larger graphs and sequences. While heuristic approaches exist, they do not guarantee optimal alignment and sacrifice alignment accuracy.

Article Synopsis
  • This research presents a comprehensive pangenome assembly of 27 pig genomes, marking a major milestone in pig genomic data.
  • The analysis highlights how structural variations influence adaptation and breed-specific traits, identifying a crucial gene linked to intramuscular fat and meat quality.
  • These findings emphasize the importance of multi-genome studies in discovering significant genomic traits for agricultural improvement.

Cattle have been selectively bred for coat color, spotting, and depigmentation patterns. The genetic variants underlying the characteristic white head of Fleckvieh, Simmental, and Hereford cattle, which are assumed to be inherited in an autosomal dominant manner, have not yet been identified, although a contribution of structural variation upstream of the gene has been proposed. Here, we construct a graph pangenome from 24 haplotype assemblies representing seven taurine cattle breeds to identify and characterize the white head-associated locus for the first time, based on long-read sequencing data and pangenome analyses.
