Panacus: fast and exact pangenome growth and core size estimation.

Luca Parmigiani, Erik Garrison, Jens Stoye, Tobias Marschall, Daniel Doerr

Bioinformatics

Center for Digital Medicine, Heinrich Heine University Düsseldorf, Düsseldorf 40225, Germany.

Published: November 2024

- Using only one linear reference genome limits the understanding of genomic diversity; the draft human pangenome shows the need for pangenomics to address these gaps and capture more genetic variation.
- A new tool, Panacus (pangenome-abacus), computes core genome size and pangenome growth and visualizes the results in interactive plots; it can analyze a human pangenome graph with 110 million nodes in under an hour.
- Panacus is open-source and built in Rust, available for installation through Bioconda, with its source code and documentation accessible on GitHub.

Motivation: Using a single linear reference genome poses a limitation to exploring the full genomic diversity of a species. The release of a draft human pangenome underscores the increasing relevance of pangenomics to overcome these limitations. Pangenomes are commonly represented as graphs, which can represent billions of base pairs of sequence. Presently, there is a lack of scalable software able to perform key tasks on pangenomes, such as quantifying universally shared sequence across genomes (the core genome) and measuring the extent of genomic variability as a function of sample size (pangenome growth).
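To make the two quantities concrete, the following is a minimal sketch, not Panacus's implementation. It assumes each genome is reduced to a set of countable items (for example graph nodes) and uses the standard combinatorial identity for the expected size of the union and intersection over all subsets of m genomes. The function names and the toy data are hypothetical.

```python
from collections import Counter
from math import comb


def coverage_histogram(genomes):
    """Return a map c -> number of items present in exactly c genomes."""
    per_item = Counter()
    for items in genomes:
        per_item.update(set(items))
    return Counter(per_item.values())


def expected_growth(hist, n):
    """Expected number of distinct items in a random subset of m genomes, m = 1..n."""
    total = sum(hist.values())
    growth = []
    for m in range(1, n + 1):
        # An item seen in c genomes is absent from comb(n - c, m) of the comb(n, m) subsets.
        missing = sum(k * comb(n - c, m) for c, k in hist.items()) / comb(n, m)
        growth.append(total - missing)
    return growth


def expected_core(hist, n):
    """Expected number of items shared by all genomes of a random subset of size m."""
    return [
        sum(k * comb(c, m) for c, k in hist.items()) / comb(n, m)
        for m in range(1, n + 1)
    ]


# Toy example: three genomes described by the items they contain.
genomes = [{"a", "b", "c"}, {"a", "b", "d"}, {"a", "e"}]
hist = coverage_histogram(genomes)
print(expected_growth(hist, len(genomes)))
print(expected_core(hist, len(genomes)))
```

Working from the coverage histogram rather than from the genomes themselves avoids enumerating subsets, so the cost per subset size depends only on the number of distinct coverage values.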

Results: We introduce Panacus (pangenome-abacus), a tool designed to rapidly perform these tasks and visualize the results in interactive plots. Panacus can process GFA files, the accepted standard for pangenome graphs, and is able to analyze a human pangenome graph with 110 million nodes in <1 h.
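As an illustration of the kind of input involved, the sketch below reads a tiny GFA 1 graph and counts how many paths visit each node. It is a simplified assumption-laden example rather than a description of how Panacus parses files: it handles only `S` (segment) and `P` (path) lines, treats every path as one genome, and the function name `node_coverage` and the sample graph are hypothetical.

```python
from collections import Counter


def node_coverage(gfa_lines):
    """Count, for each segment (node), how many paths visit it at least once."""
    segments = set()
    coverage = Counter()
    for line in gfa_lines:
        fields = line.rstrip("\n").split("\t")
        if fields[0] == "S":
            # S lines: record type, segment name, sequence.
            segments.add(fields[1])
        elif fields[0] == "P":
            # P lines: record type, path name, steps such as "1+,2-".
            # Drop the trailing orientation sign and count each node once per path.
            visited = {step[:-1] for step in fields[2].split(",") if step}
            coverage.update(visited)
    return {seg: coverage.get(seg, 0) for seg in segments}


# Hypothetical three-path graph in GFA 1 syntax.
example_gfa = [
    "H\tVN:Z:1.0",
    "S\t1\tACGT",
    "S\t2\tG",
    "S\t3\tTT",
    "P\tsampleA\t1+,2+,3+\t*",
    "P\tsampleB\t1+,3+\t*",
    "P\tsampleC\t1+,2-\t*",
]

cov = node_coverage(example_gfa)
core_nodes = sorted(seg for seg, c in cov.items() if c == 3)
print(cov, core_nodes)
```

Nodes visited by every path form the core in this toy setting; per-node counts of this kind are the raw material for core-size and growth calculations.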

Availability and Implementation: Panacus is implemented in Rust and is published as open-source software under the MIT license. The source code and documentation are available at https://github.com/marschall-lab/panacus. Panacus can be installed via Bioconda at https://bioconda.github.io/recipes/panacus/README.html.

Full text (PMC): http://www.ncbi.nlm.nih.gov/pmc/articles/PMC11665632
DOI: http://dx.doi.org/10.1093/bioinformatics/btae720

Similar Publications

Mumemto: efficient maximal matching across pangenomes.

bioRxiv

January 2025

Vikram S Shivakumar, Ben Langmead

Aligning genomes into common coordinates is central to pangenome analysis and construction, but it is also computationally expensive. Multi-sequence maximal unique matches (multi-MUMs) are guideposts for core genome alignments, helping to frame and solve the multiple alignment problem. We introduce Mumemto, a tool that computes multi-MUMs and other match types across large pangenomes.

GDBr: genomic signature interpretation tool for DNA double-strand break repair mechanisms.

Nucleic Acids Res

January 2025

Department of Convergent Bioscience and Informatics, College of Bioscience and Biotechnology, Chungnam National University, 99, Daehak-ro, Yuseong-gu, Daejeon 34134, Republic of Korea.

Hyunwoo Ryu, Hyunho Han, Chuna Kim, Jun Kim

Large genetic variants can be generated via homologous recombination (HR), such as polymerase theta-mediated end joining (TMEJ) or single-strand annealing (SSA). Given that these HR-based mechanisms leave specific genomic signatures, we developed GDBr, a genomic signature interpretation tool for DNA double-strand break repair mechanisms using high-quality genome assemblies. We applied GDBr to a draft human pangenome reference.

Geography, Antimicrobial Resistance, and Genomics of Salmonella enterica (Serotypes Newport and Anatum) from Meat in Mexico (2021-2023).

Microorganisms

December 2024

Centro de Investigación en Salud Poblacional, Instituto Nacional de Salud Pública (INSP), Morelos 62100, Mexico.

Eduardo Canek Reynoso, Enrique Jesús Delgado-Suárez, Cindy Fabiola Hernández-Pérez, Yaselda Chavarin-Pineda, Elizabeth Ernestina Godoy-Lozano

Non-typhoidal Salmonella is a major contributor to diarrheal diseases, with over 2600 serovars identified across diverse environments. In Mexico, serovars Newport and Anatum have shown a marked increase, especially in foodborne disease, posing a public health problem. We conducted a cross-sectional study from 2021 to 2023 using active epidemiological surveillance to assess Salmonella contamination in ground beef and pork at butcher shops nationwide.

Genomic Characterization of Probiotic Purple Nonsulfur Bacteria Strains S3W10 and SS15: Implications for Enhanced Shrimp Aquaculture.

Life (Basel)

December 2024

Department of Biomedical Sciences and Biomedical Engineering, Faculty of Medicine, Prince of Songkla University, Songkhla 90110, Thailand.

Chollachai Klaysubun, Nattarika Chaichana, Sirikan Suwannasin, Kamonnut Singkhamanan, Thunchanok Yaikhan

Purple nonsulfur bacteria strains S3W10 and SS15, isolated from shrimp ponds, exhibit potential probiotic benefits for aquaculture. In this study, the genomic features of S3W10 and SS15 were thoroughly characterized to evaluate their probiotic properties and safety for aquaculture use. The genomes of S3W10 and SS15 consist of 130 and 74 contigs, with sizes of 4.

Geny: a genotyping tool for allelic decomposition of killer cell immunoglobulin-like receptor genes.

Front Immunol

January 2025

Department of Computer Science, University of Victoria, Victoria, BC, Canada.

Qinghui Zhou, Mazyar Ghezelji, Ananth Hari, Michael K B Ford, Connor Holley

Introduction: Accurate genotyping of Killer cell Immunoglobulin-like Receptor (KIR) genes plays a pivotal role in enhancing our understanding of innate immune responses, disease correlations, and the advancement of personalized medicine. However, due to the high variability of the KIR region and high level of sequence similarity among different KIR genes, the generic genotyping workflows are unable to accurately infer copy numbers and complete genotypes of individual KIR genes from next-generation sequencing data. Thus, specialized genotyping tools are needed to genotype this complex region.
