Assembling Reads Improves Taxonomic Classification of Species.

Genes (Basel)

Department of Computer Science, University of Memphis, Memphis, TN 38152, USA.

Published: August 2020

Most current approaches to metagenomic classification employ short next generation sequencing (NGS) reads that are present in metagenomic samples to identify unique genomic regions. NGS reads, however, might not be long enough to differentiate similar genomes. This suggests a potential for using longer reads to improve classification performance. Presently, longer reads tend to have a higher rate of sequencing errors. Thus, given the pros and cons, it remains unclear which type of read is better for metagenomic classification. We compared two taxonomic classification protocols: a traditional assembly-free protocol and a novel assembly-based protocol. The novel assembly-based protocol consists of assembling short reads into longer reads, which are subsequently classified by a traditional taxonomic classifier. We discovered that most classifiers made fewer predictions with longer reads and that they achieved higher classification performance on synthetic metagenomic data. Generally, we observed a significant increase in precision with similar recall rates. On real data, we observed similar characteristics, which suggests that the classifiers might likewise achieve higher precision with similar recall when given longer reads. We have shown a noticeable difference in performance between assembly-based and assembly-free taxonomic classification. This finding strongly suggests that classifying species in metagenomic environments can be achieved with higher overall performance simply by assembling short reads. Further, it also suggests that long-read technologies might be better for species classification.
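The relationship the abstract reports (fewer predictions, higher precision, similar recall) can be illustrated with a minimal sketch. The species names and call sets below are hypothetical toy data, not results from the study, and the set-based scoring is an assumption about how species-level precision and recall might be computed, not the authors' evaluation code.

```python
def precision_recall(predicted, truth):
    """Return (precision, recall) for a set of predicted species names."""
    predicted, truth = set(predicted), set(truth)
    true_positives = len(predicted & truth)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(truth) if truth else 0.0
    return precision, recall


# Hypothetical ground truth for a synthetic sample (illustrative only).
truth = {"sp_A", "sp_B", "sp_C", "sp_D"}

# Hypothetical classifier output on short reads: more predictions overall.
short_read_calls = {"sp_A", "sp_B", "sp_C", "sp_X", "sp_Y", "sp_Z"}

# Hypothetical classifier output on assembled (longer) reads: fewer predictions.
assembled_calls = {"sp_A", "sp_B", "sp_C", "sp_X"}

short_p, short_r = precision_recall(short_read_calls, truth)
long_p, long_r = precision_recall(assembled_calls, truth)

print(f"assembly-free:  precision={short_p:.2f} recall={short_r:.2f}")
print(f"assembly-based: precision={long_p:.2f} recall={long_r:.2f}")
```

In this toy case both protocols recover the same three true species, so recall is unchanged, while dropping two false positives raises precision from 0.50 to 0.75.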

http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7465921	PMC
http://dx.doi.org/10.3390/genes11080946	DOI Listing

Similar Publications

Social Determinants of Health and Cross-Sectional Cognitive Intra-Individual Variability in Adults from the Deep South Living with HIV.

Arch Clin Neuropsychol

January 2025

School of Nursing, University of Alabama at Birmingham, Birmingham, AL 35294, USA.

Victor A Del Bene Pariya L Fazeli Jason A Blake Wei Li Christopher Collette

Cognitive intra-individual variability (IIV) is a sensitive marker of neuropathology and is increased in people with HIV (PWH). In a sample of PWH from the United States Deep South, we examined the relationship of cognitive IIV with cognitive impairment and social determinants of health (SDoH). This secondary analysis included 131 PWH from a larger cognitive training protocol.

Multiple introductions of Staphylococcus capitis NRCS-A to the neonatal intensive care unit drive neonatal bloodstream infections: a case-control and environmental genomic survey.

Microb Genom

January 2025

Department of Microbiology, Oxford University Hospitals, Oxford, UK.

Emily A Lees Jessica Gentry Hermione Webster Nicholas Sanderson David Eyre

The Staphylococcus capitis NRCS-A strain has emerged as a global cause of late-onset sepsis associated with outbreaks in neonatal intensive care units (NICUs) whose transmission is incompletely understood. Demographic and clinical data for 45 neonates with S. capitis and 90 with other coagulase-negative staphylococci (CoNS) isolated from sterile sites were reviewed, and clinical significance was determined. S. capitis isolated from 27 neonates at 2 hospitals between 2017 and 2022 underwent long-read (ONT) (n=27) and short-read (Illumina) sequencing (n=18).

Nanopore Data-Driven Chromosome-Level Assembly of Flax Genome.

Plants (Basel)

December 2024

Engelhardt Institute of Molecular Biology, Russian Academy of Sciences, 119991 Moscow, Russia.

Alexander A Arkhipov Elena N Pushkova Nadezhda L Bolsheva Tatiana A Rozhmina Elena V Borkhert

Flax is an important crop grown for seed and fiber. Flax chromosome number is 2n = 30, and its genome size is about 450-480 Mb. To date, the genomes of several flax varieties have been sequenced and assembled.

Transposable Elements Contribute to the Regulation of Long Noncoding RNAs in .

Insects

November 2024

College of Life Science, Hebei University, Baoding 071002, China.

Yuli Gan Lingyan Wang Guoxian Liu Xiruo Guo Yiming Zhou

Transposable elements (TEs) and noncoding sequences are major components of the genome, yet their functional contributions to long noncoding RNAs (lncRNAs) are not well understood. Although many lncRNAs originating from TEs (TE-lncRNAs) have been identified across various organisms, their characteristics and regulatory roles, particularly in insects, remain largely unexplored. This study integrated multi-omics data to investigate TE-lncRNAs in , focusing on the influence of transposons across different omics levels.

LncSL: A Novel Stacked Ensemble Computing Tool for Subcellular Localization of lncRNA by Amino Acid-Enhanced Features and Two-Stage Automated Selection Strategy.

Int J Mol Sci

December 2024

School of Computer Science and Artificial Intelligence Aliyun School of Big Data School of Software, Changzhou University, Changzhou 213164, China.

Lun Zhu Hong Chen Sen Yang

Long non-coding RNA (lncRNA) is a non-coding RNA longer than 200 nucleotides, crucial for functions like cell cycle regulation and gene transcription. Accurate localization prediction from sequence information is vital for understanding lncRNA's biological roles. Computational methods offer an effective alternative to traditional experimental methods for annotating lncRNA subcellular positions.
