HMMPolish: a coding region polishing tool for TGS-sequenced RNA viruses.

Brief Bioinform

Electrical Engineering, City University of Hong Kong, Tat Chee Avenue, Kowloon, Hong Kong, China.

Published: September 2023

Access to accurate viral genomes is important to downstream data analysis. Third-generation sequencing (TGS) has recently become a popular platform for virus sequencing because of its long read length. However, its per-base error rate, which is higher than that of next-generation sequencing, can lead to genomes with errors. Polishing tools are thus needed to correct errors either before or after sequence assembly. Despite the promising results of available polishing tools, there is still room to improve error correction and thereby produce more accurate genome assemblies. Errors, particularly those in coding regions, can hamper analyses such as lineage identification and variant monitoring. In this work, we developed a novel pipeline, HMMPolish, for correcting (polishing) errors in protein-coding regions of known RNA viruses. This tool can be applied to either raw TGS reads or the assembled sequences of the target virus. By utilizing profile Hidden Markov Models of protein families/domains in known viruses, HMMPolish can correct errors that are ignored by available polishers. We extensively validated HMMPolish on 34 datasets that covered four clinically important viruses: HIV-1, influenza-A, norovirus, and severe acute respiratory syndrome coronavirus 2. These datasets contain reads with different properties, such as sequencing depth and platform (PacBio or Nanopore). The benchmark results against popular/representative polishers show that HMMPolish competes favorably on error correction in coding regions of known RNA viruses.
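To illustrate why uncorrected errors in coding regions can hamper downstream analysis, the following minimal sketch shows how a single-base deletion shifts the reading frame and truncates the translated protein. It is not HMMPolish's algorithm; the sequences are made-up toy examples, and `translate` is a hypothetical helper that uses the standard genetic code.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate a DNA string in frame 0, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":  # stop codon ends translation
            break
        protein.append(aa)
    return "".join(protein)


# Toy coding sequence (assumption: illustrative only, not from any real virus).
reference = "ATGGCCAAGCTGACCTGA"

# Simulate a sequencing error: one base (index 4) is deleted.
read_with_deletion = reference[:4] + reference[5:]

print(translate(reference))           # full-length toy protein
print(translate(read_with_deletion))  # frameshifted, prematurely truncated
```

In this toy case the deletion changes every downstream codon and introduces an early stop codon, so the translated product no longer matches the expected protein. A profile HMM of the protein family can flag this kind of disagreement, which is the signal a protein-aware polisher can exploit.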

Source
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC10516367 (PMC)
http://dx.doi.org/10.1093/bib/bbad264 (DOI)

Similar Publications

West Nile Neuroinvasive Disease: Case Series and Review of the Literature.

Am J Ther

January 2025

Department of Medicine, Long Island Jewish Forest Hills (Northwell Health), Forest Hills, NY.

Background: West Nile virus (WNV), although underdiagnosed, is the most common mosquito-borne disease and the second most common cause of viral encephalitis in the United States. Fewer than 1% of those infected develop neuroinvasive disease.

Methods: We present a cluster of 3 cases of neuroinvasive WNV that occurred between August and September 2023 and a review of the literature for neurologic involvement with this virus.

Introduction: Dengue viruses (DENVs), the causative agents of dengue hemorrhagic fever and dengue shock syndrome, undergo genetic mutations that result in new strains and lead to ongoing global re-infections.

Objectives: To address the growing complexity of identifying and tracking biological samples, this study screened RNA barcode segments for the four DENV serotypes, ensuring high specificity and recall rates for DENV identification using segments.

Results: Through analyzing complete genome sequences of DENVs, we screened eight barcode segments for DENV, DENV-1, DENV-2, DENV-3, and DENV-4 identification.

Background: The COVID-19 pandemic heightened anxiety levels globally, disproportionately affecting low- and middle-income countries (LMICs). Physical activity (PA) has shown potential to alleviate mental health challenges, including anxiety. This study explores the relationship between PA and anxiety among Tanzanian adults during the pandemic, examining whether self-reported health status moderates this relationship and identifying demographic variations.

Eastern equine encephalitis virus (EEEV) is an arthropod-borne, positive-sense RNA alphavirus posing a substantial threat to public health. Unlike similar viruses such as SARS-CoV-2, EEEV replicates efficiently in neurons, producing progeny viral particles as soon as 3-4 hours post-infection. EEEV infection, which can cause severe encephalitis with a human mortality rate surpassing 30%, has no licensed, targeted therapies, leaving patients to rely on supportive care.

Exoribonuclease-resistant RNAs (xrRNAs) are viral RNA structures that block degradation by cellular 5'-3' exoribonucleases to produce subgenomic viral RNAs during infection. Initially discovered in flaviviruses, xrRNAs have since been identified in a wide range of RNA viruses, including those that infect plants. High sequence variability among viral xrRNAs raises questions about the shared molecular features that characterize this functional RNA class.
