The advent of long-read sequencing of microbiomes necessitates new taxonomic profilers tailored to long-read shotgun metagenomic datasets. Here, we introduce Lemur and Magnet, a pair of tools optimized for lightweight and accurate taxonomic profiling of such datasets. Lemur is a marker-gene-based method that uses an expectation-maximization (EM) algorithm to reduce false positive calls while preserving true positives; Magnet is a whole-genome read-mapping-based method that provides detailed presence and absence calls for bacterial genomes. We demonstrate that Lemur and Magnet can run in minutes to hours on a laptop with 32 GB of RAM, even for large inputs, a crucial feature given the portability of long-read sequencing machines. Furthermore, the marker gene database used by Lemur is only 4 GB and contains information from over 300,000 RefSeq genomes. Lemur and Magnet are open-source and available at https://github.com/treangenlab/lemur and https://github.com/treangenlab/magnet.
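
To illustrate how an EM algorithm can suppress false positive calls, the following is a minimal generic sketch of EM-based abundance estimation from read-to-taxon likelihoods. It is not Lemur's implementation: the function name `em_abundance`, its parameters, and the input layout (one dictionary of taxon index to likelihood per read) are assumptions made for illustration only.

```python
def em_abundance(read_likelihoods, n_taxa, n_iter=200, tol=1e-9):
    """Estimate relative abundances with a plain EM loop.

    read_likelihoods: list with one dict per read, mapping a taxon index
    to P(read | taxon). This layout is a hypothetical one chosen for the
    sketch, not a format used by any particular tool.
    """
    # Start from a uniform abundance vector.
    abundance = [1.0 / n_taxa] * n_taxa
    for _ in range(n_iter):
        counts = [0.0] * n_taxa
        for read in read_likelihoods:
            # E-step: share the read across taxa in proportion to
            # current abundance times the read's likelihood.
            weights = {t: abundance[t] * p for t, p in read.items()}
            total = sum(weights.values())
            if total == 0.0:
                continue  # read cannot be explained by any taxon
            for t, w in weights.items():
                counts[t] += w / total
        assigned = sum(counts)
        if assigned == 0.0:
            break
        # M-step: new abundances are the normalized expected read counts.
        updated = [c / assigned for c in counts]
        delta = max(abs(a - b) for a, b in zip(updated, abundance))
        abundance = updated
        if delta < tol:
            break
    return abundance


# Three reads map only to taxon 0; one read is ambiguous between taxa 0 and 1.
reads = [{0: 1.0}, {0: 1.0}, {0: 1.0}, {0: 0.5, 1: 0.5}]
estimate = em_abundance(reads, n_taxa=2)
```

In this toy input, taxon 1 is supported only by a read that taxon 0 explains equally well, so each iteration shifts more of that read toward taxon 0 and the estimate for taxon 1 shrinks toward zero. A profiler could then drop taxa whose abundance falls below a chosen threshold, which is one way an EM step can remove spurious calls while keeping well-supported taxa.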

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC11185576
DOI: http://dx.doi.org/10.1101/2024.06.01.596961