AI Article Synopsis

  • Relying on a single linear reference genome limits how much of a species' genomic diversity can be explored; the draft human pangenome underscores the need for pangenomics to capture more genetic variation.
  • A new tool, Panacus (pangenome-abacus), computes core genome size and pangenome growth and shows the results in interactive plots; it can analyze a human pangenome graph with 110 million nodes in under an hour.
  • Panacus is open-source software (MIT license) implemented in Rust; it can be installed through Bioconda, and its source code and documentation are available on GitHub.

Article Abstract

Motivation: Using a single linear reference genome poses a limitation to exploring the full genomic diversity of a species. The release of a draft human pangenome underscores the increasing relevance of pangenomics to overcome these limitations. Pangenomes are commonly represented as graphs, which can represent billions of base pairs of sequence. Presently, there is a lack of scalable software able to perform key tasks on pangenomes, such as quantifying universally shared sequence across genomes (the core genome) and measuring the extent of genomic variability as a function of sample size (pangenome growth).
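The two quantities named above can be illustrated on a toy example. The following is a minimal sketch, not Panacus's implementation: it assumes each genome is given as a set of graph node IDs, counts nodes rather than base pairs, and uses hypothetical function names. Growth at sample size k is computed as the expected number of distinct nodes in a uniformly random subset of k genomes, derived from the coverage histogram rather than by enumerating subsets.

```python
from collections import Counter
from math import comb


def coverage_histogram(genomes):
    """Map coverage c -> number of nodes present in exactly c genomes."""
    node_coverage = Counter()
    for nodes in genomes.values():
        node_coverage.update(set(nodes))
    return Counter(node_coverage.values())


def core_size(hist, n):
    """Number of nodes shared by all n genomes (the core)."""
    return hist.get(n, 0)


def pangenome_growth(hist, n):
    """Expected number of distinct nodes in a random k-subset, for k = 1..n."""
    growth = []
    for k in range(1, n + 1):
        total = comb(n, k)
        # A node with coverage c is missed only if all k genomes come
        # from the n - c genomes that lack it.
        expected = sum(
            count * (1 - comb(n - c, k) / total) for c, count in hist.items()
        )
        growth.append(expected)
    return growth


# Toy input (assumed data, three genomes over five nodes).
genomes = {"A": {1, 2, 3}, "B": {1, 2, 4}, "C": {1, 5}}
hist = coverage_histogram(genomes)
print(core_size(hist, len(genomes)))
print(pangenome_growth(hist, len(genomes)))
```

For this toy input the core is a single node, and the growth curve rises from about 2.67 nodes for one genome to 4 for two and 5 for all three. Working from the histogram avoids enumerating subsets, which matters when the number of genomes is large.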

Results: We introduce Panacus (pangenome-abacus), a tool designed to rapidly perform these tasks and visualize the results in interactive plots. Panacus can process GFA files, the accepted standard for pangenome graphs, and is able to analyze a human pangenome graph with 110 million nodes in <1 h.
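To make the GFA input concrete, here is a small sketch of reading segment (S), path (P) and walk (W) lines from GFA text and measuring the core in base pairs. It is an illustration under simplifying assumptions, not how Panacus parses files: the function name is hypothetical, each path or walk is treated as one genome, and orientation and overlaps are ignored.

```python
import re
from collections import defaultdict


def parse_gfa(lines):
    """Return (segment lengths, node sets per path/walk) from GFA lines."""
    seg_len = {}
    paths = defaultdict(set)
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if fields[0] == "S":
            seq = fields[2]
            if seq == "*":
                # Sequence omitted: fall back to the optional LN:i: tag.
                ln = next((f[5:] for f in fields[3:] if f.startswith("LN:i:")), "0")
                seg_len[fields[1]] = int(ln)
            else:
                seg_len[fields[1]] = len(seq)
        elif fields[0] == "P":
            # P <name> <seg+,seg-,...> <overlaps>; drop the orientation sign.
            paths[fields[1]].update(step[:-1] for step in fields[2].split(","))
        elif fields[0] == "W":
            # W <sample> <haplotype> <seq_id> <start> <end> <walk>
            name = "#".join(fields[1:4])
            paths[name].update(re.findall(r"[<>]([^<>]+)", fields[6]))
    return seg_len, dict(paths)


# Toy graph (assumed data): three segments, two paths and one walk.
gfa = (
    "H\tVN:Z:1.1\n"
    "S\t1\tACGT\n"
    "S\t2\tGG\n"
    "S\t3\tT\n"
    "P\tsampleA\t1+,2+\t*\n"
    "P\tsampleB\t1+,3-\t*\n"
    "W\tsampleC\t1\tchr1\t0\t5\t>1<3\n"
)
seg_len, paths = parse_gfa(gfa.splitlines())
core_nodes = set.intersection(*paths.values())
core_bp = sum(seg_len[node] for node in core_nodes)
print(core_bp, sum(seg_len.values()))
```

In this toy graph only segment 1 is traversed by every path, so the core covers 4 of the 7 base pairs. A graph with millions of nodes would need a streaming, memory-conscious reader rather than in-memory sets of strings.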

Availability And Implementation: Panacus is implemented in Rust and is published as Open Source software under the MIT license. The source code and documentation are available at https://github.com/marschall-lab/panacus. Panacus can be installed via Bioconda at https://bioconda.github.io/recipes/panacus/README.html.

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC11665632
DOI: http://dx.doi.org/10.1093/bioinformatics/btae720

Publication Analysis

Top Keywords

human pangenome: 8
pangenome: 5
panacus fast: 4
fast exact: 4
exact pangenome: 4
pangenome growth: 4
growth core: 4
core size: 4
size estimation: 4
estimation motivation: 4

Similar Publications

Aligning genomes into common coordinates is central to pangenome analysis and construction, but it is also computationally expensive. Multi-sequence maximal unique matches (multi-MUMs) are guideposts for core genome alignments, helping to frame and solve the multiple alignment problem. We introduce Mumemto, a tool that computes multi-MUMs and other match types across large pangenomes.

GDBr: genomic signature interpretation tool for DNA double-strand break repair mechanisms.

Nucleic Acids Res

January 2025

Department of Convergent Bioscience and Informatics, College of Bioscience and Biotechnology, Chungnam National University, 99, Daehak-ro, Yuseong-gu, Daejeon 34134, Republic of Korea.

Large genetic variants can be generated via homologous recombination (HR), such as polymerase theta-mediated end joining (TMEJ) or single-strand annealing (SSA). Given that these HR-based mechanisms leave specific genomic signatures, we developed GDBr, a genomic signature interpretation tool for DNA double-strand break repair mechanisms using high-quality genome assemblies. We applied GDBr to a draft human pangenome reference.

Non-typhoidal Salmonella is a major contributor to diarrheal diseases, with over 2600 serovars identified across diverse environments. In Mexico, serovars Newport and Anatum have shown a marked increase, especially in foodborne disease, posing a public health problem. We conducted a cross-sectional study from 2021 to 2023 using active epidemiological surveillance to assess contamination in ground beef and pork at butcher shops nationwide.

strains S3W10 and SS15, isolated from shrimp ponds, exhibit potential probiotic benefits for aquaculture. In this study, the genomic features of S3W10 and SS15 were thoroughly characterized to evaluate their probiotic properties and safety for aquaculture use. The genomes of S3W10 and SS15 consist of 130 and 74 contigs, with sizes of 4.

Introduction: Accurate genotyping of Killer cell Immunoglobulin-like Receptor (KIR) genes plays a pivotal role in enhancing our understanding of innate immune responses, disease correlations, and the advancement of personalized medicine. However, due to the high variability of the KIR region and high level of sequence similarity among different KIR genes, the generic genotyping workflows are unable to accurately infer copy numbers and complete genotypes of individual KIR genes from next-generation sequencing data. Thus, specialized genotyping tools are needed to genotype this complex region.
