Background: The study of sample taxonomic composition has evolved from direct observation and labor-intensive morphological studies to a range of DNA sequencing methodologies. Most of these studies rely on metabarcoding, which involves amplifying a small, taxonomically informative portion of the genome and sequencing it at high throughput. Recent advances in sequencing technology from Oxford Nanopore Technologies have revolutionized the field by offering portability, affordable cost and long reads, which in turn significantly increase taxonomic resolution. However, Nanopore sequencing data have a distinctive profile, with a higher error rate than Illumina data, and existing bioinformatics pipelines for analyzing them are scarce and often insufficient; accurate processing of long-read sequences requires specialized tools.
Results: We present PRONAME (PROcessing NAnopore MEtabarcoding data), an open-source, user-friendly pipeline optimized for processing raw Nanopore sequencing data. PRONAME includes precompiled databases for complete 16S sequences (Silva138 and Greengenes2) and a newly developed and curated database dedicated to bacterial 16S-ITS-23S operon sequences. Users can also supply a custom database, which enables the analysis of metabarcoding data from any domain of life. The pipeline significantly improves sequence accuracy by implementing innovative error-correction strategies and by taking advantage of the new sequencing chemistry to produce high-quality duplex reads. Evaluations using a mock community have shown that PRONAME delivers consensus sequences with at least 99.5% accuracy under standard settings (and up to 99.7%), making it a robust tool for genomic analysis of complex multi-species communities.
Conclusion: PRONAME meets the challenges of long-read Nanopore data processing, offering greater accuracy and versatility than existing pipelines. By integrating Nanopore-specific quality filtering, clustering and error correction, PRONAME produces high-precision consensus sequences. This brings the accuracy of Nanopore sequencing close to that of Illumina sequencing, while taking advantage of the benefits of long-read technologies.
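The quality-filtering step mentioned above can be illustrated with a minimal sketch. This is not PRONAME's implementation: the function names, the thresholds and the tuple-based read format are all hypothetical, chosen only to show one common way of scoring Nanopore reads, namely averaging per-base error probabilities rather than averaging Phred scores directly.

```python
import math


def mean_phred(quality_string: str, offset: int = 33) -> float:
    """Return the mean read quality of a FASTQ quality string.

    Per-base Phred scores are converted to error probabilities, averaged,
    and converted back. This penalizes low-quality bases more than a plain
    arithmetic mean of the Phred scores would.
    """
    if not quality_string:
        raise ValueError("empty quality string")
    error_probs = [10 ** (-(ord(ch) - offset) / 10) for ch in quality_string]
    mean_error = sum(error_probs) / len(error_probs)
    return -10 * math.log10(mean_error)


def filter_reads(records, min_quality=10.0, min_length=1000, max_length=2000):
    """Yield (name, sequence, quality) tuples that pass both filters.

    The default thresholds are illustrative assumptions, not values taken
    from the PRONAME pipeline.
    """
    for name, seq, qual in records:
        if min_length <= len(seq) <= max_length and mean_phred(qual) >= min_quality:
            yield name, seq, qual


if __name__ == "__main__":
    # Hypothetical reads: one good, one low quality, one too short.
    reads = [
        ("good", "ACGT" * 300, "I" * 1200),
        ("low_quality", "ACGT" * 300, "#" * 1200),
        ("too_short", "ACGT", "IIII"),
    ]
    for name, _, qual in filter_reads(reads):
        print(name, round(mean_phred(qual), 1))
```

In practice the length window would be set around the expected amplicon size (for example, a full-length 16S amplicon versus a longer 16S-ITS-23S operon amplicon), and the quality threshold would depend on the sequencing chemistry and basecalling mode.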
| Download full-text PDF | Source |
|---|---|
| http://www.ncbi.nlm.nih.gov/pmc/articles/PMC11695402 | PMC |
| http://dx.doi.org/10.3389/fbinf.2024.1483255 | DOI Listing |
CRISPR J
January 2025
OceanOmics, The Minderoo Foundation, Perth, Australia.
Characterizing biodiversity using environmental DNA (eDNA) represents a paradigm shift in our capacity for biomonitoring complex environments, both aquatic and terrestrial. However, eDNA biomonitoring is limited by biases toward certain species and the low taxonomic resolution of current metabarcoding approaches. Shotgun metagenomics of eDNA enables the collection of whole ecosystem data by sequencing all molecules present, allowing characterization and identification.
Front Bioinform
December 2024
Bioengineering Unit, Life Sciences Department, Walloon Agricultural Research Centre, Gembloux, Belgium.
PLoS One
January 2025
Danau Girang Field Centre, c/o Sabah Wildlife Department, Kota Kinabalu, Malaysia.
Characterizing the feeding ecology of threatened species is essential to establish appropriate conservation strategies. We focused our study on the proboscis monkey (Nasalis larvatus), an endangered primate species which is endemic to the island of Borneo. Our survey was conducted in the Lower Kinabatangan Wildlife Sanctuary (LKWS), a riverine protected area that is surrounded by oil palm plantations.
PLoS One
December 2024
Organismal and Evolutionary Biology Research Programme, University of Helsinki, Helsinki, Finland.
Sci Rep
December 2024
Food System Integrity, AgResearch Limited, Hopkirk Research Institute, Massey University, Cnr University Avenue and Library Road, Private Bag 11008, Palmerston North, 4442, New Zealand.
Understanding the composition of complex Escherichia coli populations from the environment is necessary for identifying strategies to reduce the impacts of fecal contamination and protect public health. Metabarcoding targeting the hypervariable gene gnd was used to reveal the complex population diversity of E. coli and phenotypically indistinct Escherichia species in water, soil, sediment, aquatic biofilm, and fecal samples from native forest and pastoral sites.