Metabarcoding is a powerful molecular tool for simultaneously surveying hundreds to thousands of species from a single sample, underpinning microbiome and environmental DNA (eDNA) methods. Deriving quantitative estimates of underlying biological communities from metabarcoding is critical for enhancing the utility of such approaches for health and conservation. Recent work has demonstrated that correcting for amplification biases in genetic metabarcoding data can yield quantitative estimates of template DNA concentrations. However, a major source of uncertainty in metabarcoding data stems from non-detections across technical PCR replicates, where one replicate fails to detect a species observed in other replicates. Such non-detections are a special case of variability among technical replicates in metabarcoding data. While many sampling and amplification processes underlie observed variation in metabarcoding data, understanding the causes of non-detections is an important step in distinguishing signal from noise in metabarcoding studies. Here, we use both simulated and empirical data to (1) suggest how non-detections may arise in metabarcoding data, (2) outline steps to recognize uninformative data in practice, and (3) identify the conditions under which amplicon sequence data can reliably detect underlying biological signals. We show with both simulations and empirical data that, for a given species, the rate of non-detections among technical replicates is a function of both the template DNA concentration and species-specific amplification efficiency. Consequently, we conclude that metabarcoding datasets are strongly affected by (1) deterministic amplification biases during PCR and (2) stochastic sampling of amplicons during sequencing, both of which we can model, but also by (3) stochastic sampling of rare molecules prior to PCR, which remains a frontier for quantitative metabarcoding.
Our results highlight the importance of estimating species-specific amplification efficiencies and critically evaluating patterns of non-detection in metabarcoding datasets to better distinguish environmental signal from the noise inherent in molecular detections of rare targets.
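
The dependence of non-detection rates on template concentration and amplification efficiency can be illustrated with a small simulation. The sketch below is not the authors' model; it is a minimal toy version under stated assumptions: the target's template count in each replicate is Poisson distributed, amplification is deterministic with per-cycle efficiency `eff` (so copies grow by a factor of `(1 + eff) ** cycles`), a single abundant background taxon stands in for the rest of the community, and reads are drawn independently in proportion to amplicon abundance. The function names and every default value are hypothetical choices made for illustration only.

```python
import math
import random


def poisson(rng, lam):
    """Draw a Poisson variate with Knuth's method (adequate for small means)."""
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1


def nondetection_rate(target_copies, target_eff,
                      background_copies=1000.0, background_eff=0.9,
                      cycles=35, reads=1000, replicates=200, seed=1):
    """Fraction of simulated technical replicates with zero target reads.

    All defaults are illustrative assumptions, not values from the study.
    """
    rng = random.Random(seed)
    # Background taxon is assumed abundant, so its template count is held fixed.
    amp_background = background_copies * (1.0 + background_eff) ** cycles
    misses = 0
    for _ in range(replicates):
        # Stochastic sampling of rare target molecules prior to PCR.
        n_target = poisson(rng, target_copies)
        # Deterministic, species-specific amplification during PCR.
        amp_target = n_target * (1.0 + target_eff) ** cycles
        # Stochastic sampling of amplicons during sequencing: with reads drawn
        # independently, a replicate has zero target reads with probability
        # (1 - p) ** reads, where p is the target's amplicon proportion.
        p = amp_target / (amp_target + amp_background)
        if rng.random() < (1.0 - p) ** reads:
            misses += 1
    return misses / replicates


if __name__ == "__main__":
    for copies, eff in [(0.5, 0.9), (20.0, 0.9), (20.0, 0.6)]:
        rate = nondetection_rate(copies, eff)
        print(f"mean copies={copies}, efficiency={eff}: non-detection rate={rate:.2f}")
```

Under these assumptions, non-detections become frequent either when the target is rare before PCR or when it amplifies less efficiently than the background, even if its template concentration is not especially low.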


Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC10174484
PLOS: http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0285674


