The advent of next-generation sequencing has allowed for higher-throughput determination of which species live within a specific location. Here we establish that three analysis methods for estimating diversity within samples-namely, Operational Taxonomic Units; the newer Amplicon Sequence Variants; and a method commonly found in sequence analysis, minhash-are affected by various properties of these sequence data. Using simulations we show that the presence of Single Nucleotide Polymorphisms and the depth of coverage from each species affect the correlations between these approaches. Through this analysis, we provide insights which would affect the decisions on the application of each method. Specifically, the presence of sequence read errors and variability in sequence read coverage deferentially affects these processing methods.

Download full-text PDF

Source
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7189665PMC
http://dx.doi.org/10.3390/mps3010022DOI Listing

Publication Analysis

Top Keywords

sequence read
8
sequence
5
uncovering effects
4
effects structure
4
structure metabarcode
4
metabarcode sequences
4
sequences metagenetic
4
metagenetic microbiome
4
analysis
4
microbiome analysis
4

Similar Publications

Want AI Summaries of new PubMed Abstracts delivered to your In-box?

Enter search terms and have AI summaries delivered each week - change queries or unsubscribe any time!