The study of microbial communities crucially relies on the comparison of metagenomic next-generation sequencing data sets, for which several methods have been designed in recent years. Here, we review three key challenges in the comparison of such data sets: species identification and quantification, the efficient computation of distances between metagenomic samples, and the identification of metagenomic features associated with a phenotype such as disease status. We present current solutions to these challenges, considering both reference-based methods, which rely on a database of reference genomes, and reference-free methods, which work directly on all sequencing reads from the samples.
DOI: http://dx.doi.org/10.1093/bib/bbaa121