Background: Sequencing of marker genes amplified from environmental samples, known as amplicon sequencing, allows us to resolve some of the hidden diversity and elucidate evolutionary relationships and ecological processes among complex microbial communities. The analysis of large numbers of samples at high sequencing depths generated by high throughput sequencing technologies requires efficient, flexible, and reproducible bioinformatics pipelines. Only a few existing workflows can be run in a user-friendly, scalable, and reproducible manner on different computing devices using an efficient workflow management system.

Results: We present Natrix, an open-source bioinformatics workflow for preprocessing raw amplicon sequencing data. The workflow covers all analysis steps, from quality assessment, read assembly, dereplication, chimera detection, split-sample merging, and sequence representative assignment (OTUs or ASVs) to the taxonomic assignment of sequence representatives. The workflow is written using Snakemake, a workflow management engine for developing data analysis workflows, which ensures reproducibility, while Conda provides version control of the utilized programs. The encapsulation of rules and their dependencies supports hassle-free sharing of rules between workflows and easy adaptation and extension of existing workflows. Natrix is freely available on GitHub (https://github.com/MW55/Natrix) or as a Docker container on DockerHub (https://hub.docker.com/r/mw55/natrix).

Conclusion: Natrix is a user-friendly and highly extensible workflow for processing Illumina amplicon data.
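To make one of the steps named in the Results concrete, the following is a minimal Python sketch of dereplication, that is, collapsing identical reads into unique sequences with abundance counts. It is a toy illustration only, not the implementation used by Natrix; the function name `dereplicate` and the example reads are hypothetical.

```python
from collections import Counter


def dereplicate(reads):
    """Collapse identical sequences into (sequence, abundance) pairs.

    Hypothetical helper for illustration; not part of Natrix.
    """
    # Treat upper- and lower-case bases as the same sequence.
    counts = Counter(seq.upper() for seq in reads)
    # Most abundant first; ties broken alphabetically for a stable order.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


# Made-up example reads, not real data.
reads = ["ACGT", "acgt", "TTGA", "ACGT", "TTGA", "GGCC"]
unique = dereplicate(reads)
for sequence, abundance in unique:
    print(f"{sequence}\t{abundance}")
```

Exact-match counting like this only shows the idea; production tools additionally handle quality filtering, length differences, and similarity thresholds.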

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7667751
DOI: http://dx.doi.org/10.1186/s12859-020-03852-4

Similar Publications

Plants are colonized by a vast array of microorganisms that outstrip plant cell densities and genes and are thus referred to as the plant's second genome or extended genome. These microbial communities exert a significant influence on the vigor, growth, development, and productivity of plants by supporting nutrient acquisition, organic matter decomposition, and tolerance against biotic and abiotic stresses such as heat, high salt, drought, and disease, and by regulating plant defense responses. The rhizosphere is a complex micro-ecological zone in the direct vicinity of plant roots and is considered a hotspot of microbial diversity.

Evaluating endogenous viral targets as potential treatment monitoring surrogates for onsite non-potable water reuse.

Environ Sci (Camb)

February 2024

U.S. Environmental Protection Agency, Office of Research and Development, 26 W. Martin Luther King Drive, Cincinnati, OH 45268, USA.

Onsite non-potable water reuse systems (ONWS) are decentralized systems that treat and repurpose locally collected waters (greywater or combined wastewater) for uses such as irrigation and flushing toilets. To ensure that treatment is meeting risk benchmarks, it is necessary to monitor the efficacy of pathogen removal. However, accurate assessment of pathogen reduction is hampered by their sporadic and low occurrence rates in source waters and concentrations in treated water that are generally below measurement detection limits.

Targeted metagenomics is a rapidly expanding technology for analyzing complex biological samples and for genetic monitoring of environmental samples. In this research field, data analysis plays a crucial role. To teach targeted metagenomics data analysis, we developed a 4-week inquiry-driven modular course-based undergraduate research experience (mCURE) using publicly available Australian coral microbiome DNA sequencing data and associated metadata.

First report of Anaplasma marginale and Anaplasma ovis in goats in Kelantan, Malaysia.

Trop Biomed

December 2024

Departments of Veterinary Parasitology and Entomology, University of Maiduguri, P.M.B. 1069, Maiduguri 600230, Nigeria.

Anaplasma species are obligate rickettsial intraerythrocytic pathogens that cause a tick-borne disease of economic importance in livestock production in many countries. Although Anaplasma species have been detected in farm animals worldwide, there is a paucity of information on Anaplasma infections in goats from Malaysia. Thus, this study aimed to assess the infection rate and identify Anaplasma species and some selected risk factors in goats across selected districts in Kelantan, Malaysia.

Background: This cross-sectional study aimed to compare the composition of the submucosal microbiome of peri-implantitis with paired and unpaired healthy implant samples.

Methods: We evaluated submucosal plaque samples obtained in 39 cases, including 13 cases of peri-implantitis, 13 cases involving healthy implants from the same patient (paired samples), and 13 cases involving healthy implants from different individuals (unpaired samples). The patients were evaluated using next-generation genomic sequencing (Illumina) based on 16S rRNA gene amplification.
