Estimations of microbial community diversity based on metagenomic data sets are affected, often to an unknown degree, by biases derived from insufficient coverage and reference database-dependent estimations of diversity. For instance, the completeness of reference databases cannot be generally estimated since it depends on the extant diversity sampled to date, which, with the exception of a few habitats such as the human gut, remains severely undersampled. Further, estimation of the degree of coverage of a microbial community by a metagenomic data set is prohibitively time-consuming for large data sets, and coverage values may not be directly comparable between data sets obtained with different sequencing technologies. Here, we extend Nonpareil, a database-independent tool for the estimation of coverage in metagenomic data sets, to a high-performance computing implementation that scales up to hundreds of cores and includes, in addition, a k-mer-based estimation as sensitive as the original alignment-based version but about three hundred times as fast. Further, we propose a metric of sequence diversity (N_d) derived directly from Nonpareil curves that correlates well with alpha diversity assessed by traditional metrics. We use this metric in different experiments demonstrating the correlation with the Shannon index estimated on 16S rRNA gene profiles and show that N_d additionally reveals seasonal patterns in marine samples that are not captured by the Shannon index and provides more precise rankings of the magnitude of diversity of microbial communities in different habitats. Therefore, the new version of Nonpareil, called Nonpareil 3, advances the toolbox for metagenomic analyses of microbiomes.
Estimation of the coverage provided by a metagenomic data set, i.e., what fraction of the microbial community was sampled by DNA sequencing, represents an essential first step of every culture-independent genomic study that aims to robustly assess the sequence diversity present in a sample. However, estimation of coverage remains elusive because of several technical limitations associated with high computational requirements and limiting statistical approaches to quantify diversity. Here we describe Nonpareil 3, a new bioinformatics algorithm that circumvents several of these limitations and thus can facilitate culture-independent studies in clinical or environmental settings, independent of the sequencing platform employed. In addition, we present a new metric of sequence diversity based on rarefied coverage and demonstrate its use in communities from diverse ecosystems.
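To make the two quantities mentioned above more concrete, the following is a minimal sketch, not Nonpareil's implementation. It assumes a simplified notion of read redundancy (a read counts as redundant if it shares at least one k-mer with another read in the set) and uses the standard Shannon index formula on an abundance profile. The function names `kmers`, `redundancy`, and `shannon_index`, the toy reads, and the choice of k = 4 are all hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


def kmers(read, k):
    """Return the set of distinct k-mers in a read."""
    return {read[i:i + k] for i in range(len(read) - k + 1)}


def redundancy(reads, k=4):
    """Toy redundancy: fraction of reads sharing >= 1 k-mer with another read.

    Assumption: this is a simplified stand-in for the redundancy idea,
    not the estimator used by Nonpareil.
    """
    counts = Counter()
    for read in reads:
        # Each read contributes each of its distinct k-mers once.
        counts.update(kmers(read, k))
    redundant = sum(
        1 for read in reads
        if any(counts[km] > 1 for km in kmers(read, k))
    )
    return redundant / len(reads)


def shannon_index(abundances):
    """Shannon index H = -sum(p_i * ln p_i) over nonzero abundances."""
    total = sum(abundances)
    return -sum(
        (a / total) * math.log(a / total) for a in abundances if a > 0
    )


# Hypothetical toy data: two reads overlap on "ACGT", one is unrelated.
toy_reads = ["ACGTACGT", "ACGTTTTT", "GGGGCCCC"]
toy_redundancy = redundancy(toy_reads, k=4)

# Hypothetical abundance profile with two equally abundant taxa.
toy_shannon = shannon_index([10, 10])
```

In this toy setting, higher redundancy among reads suggests that more of the community has already been sampled, which is the intuition behind redundancy-based coverage estimation; real data sets would require subsampling, error tolerance, and far larger k.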
Source:
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5893860
DOI: http://dx.doi.org/10.1128/mSystems.00039-18