AI Article Synopsis

  • The study of taxonomic composition has shifted from traditional methods to advanced DNA sequencing techniques, particularly metabarcoding, which uses targeted genome portions for high-throughput sequencing.
  • Recent advances from Oxford Nanopore Technologies have made long-read sequencing portable and affordable, but Nanopore data have a higher error rate than Illumina data, and few bioinformatics pipelines handle them adequately.
  • PRONAME, a new open-source pipeline designed for Nanopore data, improves sequence accuracy and supports custom databases; in mock-community tests its consensus sequences reached at least 99.5% accuracy, making it a reliable method for analyzing complex multi-species communities.

Article Abstract

Background: The study of sample taxonomic composition has evolved from direct observations and labor-intensive morphological studies to different DNA sequencing methodologies. Most of these studies leverage the metabarcoding approach, which involves the amplification of a small taxonomically-informative portion of the genome and its subsequent high-throughput sequencing. Recent advances in sequencing technology brought by Oxford Nanopore Technologies have revolutionized the field, enabling portability, affordable cost and long-read sequencing, therefore leading to a significant increase in taxonomic resolution. However, Nanopore sequencing data exhibit a particular profile, with a higher error rate compared with Illumina sequencing, and existing bioinformatics pipelines for the analysis of such data are scarce and often insufficient, requiring specialized tools to accurately process long-read sequences.

Results: We present PRONAME (PROcessing NAnopore MEtabarcoding data), an open-source, user-friendly pipeline optimized for processing raw Nanopore sequencing data. PRONAME includes precompiled databases for complete 16S sequences (Silva138 and Greengenes2) and a newly developed and curated database dedicated to bacterial 16S-ITS-23S operon sequences. The user can also provide a custom database if desired, therefore enabling the analysis of metabarcoding data for any domain of life. The pipeline significantly improves sequence accuracy, implementing innovative error-correction strategies and taking advantage of the new sequencing chemistry to produce high-quality duplex reads. Evaluations using a mock community have shown that PRONAME delivers consensus sequences demonstrating at least 99.5% accuracy with standard settings (and up to 99.7%), making it a robust tool for genomic analysis of complex multi-species communities.

Conclusion: PRONAME meets the challenges of long-read Nanopore data processing, offering greater accuracy and versatility than existing pipelines. By integrating Nanopore-specific quality filtering, clustering and error correction, PRONAME produces high-precision consensus sequences. This brings the accuracy of Nanopore sequencing close to that of Illumina sequencing, while taking advantage of the benefits of long-read technologies.
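The quality-filtering step mentioned in the abstract can be illustrated with a minimal sketch. This is not PRONAME's code: the function names, the thresholds and the tuple-based read format are assumptions made for illustration. It shows one common way to score a Nanopore read, which is to average the per-base error probabilities and convert that mean back to a Phred value, then keep only reads above a chosen quality and length.

```python
import math


def mean_phred(quality_string, offset=33):
    """Mean read quality: average the per-base error probabilities,
    then convert the mean probability back to a Phred score.
    Assumes Phred+33 ASCII encoding (hypothetical default)."""
    if not quality_string:
        return 0.0
    probs = [10 ** (-(ord(ch) - offset) / 10) for ch in quality_string]
    mean_prob = sum(probs) / len(probs)
    return -10 * math.log10(mean_prob)


def filter_reads(reads, min_quality=10.0, min_length=1):
    """Keep (name, sequence, quality) tuples that pass both thresholds.
    The thresholds are illustrative, not PRONAME defaults."""
    return [
        (name, seq, qual)
        for name, seq, qual in reads
        if len(seq) >= min_length and mean_phred(qual) >= min_quality
    ]


if __name__ == "__main__":
    example = [
        ("read_a", "ACGT", "IIII"),  # every base at Q40
        ("read_b", "ACGT", "!!!!"),  # every base at Q0
    ]
    for name, _, _ in filter_reads(example):
        print(name)
```

Averaging error probabilities rather than raw Phred scores lets a few very poor bases pull the read's score down sharply, which suits error-prone long reads.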


Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC11695402
DOI: http://dx.doi.org/10.3389/fbinf.2024.1483255

Publication Analysis

Top Keywords

metabarcoding data: 12
consensus sequences: 12
nanopore sequencing: 12
sequencing: 10
user-friendly pipeline: 8
process long-read: 8
long-read nanopore: 8
nanopore metabarcoding: 8
sequencing data: 8
illumina sequencing: 8

Similar Publications

Characterizing biodiversity using environmental DNA (eDNA) represents a paradigm shift in our capacity for biomonitoring complex environments, both aquatic and terrestrial. However, eDNA biomonitoring is limited by biases toward certain species and the low taxonomic resolution of current metabarcoding approaches. Shotgun metagenomics of eDNA enables the collection of whole ecosystem data by sequencing all molecules present, allowing characterization and identification.


Characterizing the feeding ecology of threatened species is essential to establish appropriate conservation strategies. We focused our study on the proboscis monkey (Nasalis larvatus), an endangered primate species which is endemic to the island of Borneo. Our survey was conducted in the Lower Kinabatangan Wildlife Sanctuary (LKWS), a riverine protected area that is surrounded by oil palm plantations.

Article Synopsis
  • Advances in technology for species identification have led to the development of a new field sampling method that integrates sensor data with automated processing.
  • The LIFEPLAN project employs five systematic field sampling methods, accessible to individuals with basic biology or ecology training, to gather biodiversity data globally.
  • The article details the steps for collecting various types of data, such as images, audio, invertebrate samples, soil, and air, while emphasizing the importance of metadata and acknowledging that technology and equipment will continue to evolve for improved data collection.

Impact of land-use and fecal contamination on Escherichia populations in environmental samples.

Sci Rep

December 2024

Food System Integrity, AgResearch Limited, Hopkirk Research Institute, Massey University, Cnr University Avenue and Library Road, Private Bag 11008, Palmerston North, 4442, New Zealand.

Understanding the composition of complex Escherichia coli populations from the environment is necessary for identifying strategies to reduce the impacts of fecal contamination and protect public health. Metabarcoding targeting the hypervariable gene gnd was used to reveal the complex population diversity of E. coli and phenotypically indistinct Escherichia species in water, soil, sediment, aquatic biofilm, and fecal samples from native forest and pastoral sites.

