Despite the recent surge of viral metagenomic studies, recovering complete virus genomes from metagenomic data remains a significant challenge. Most viral contigs produced by de novo assembly programs are highly fragmented, which complicates downstream analysis and inference. To address this issue, we have developed Virseqimprover, a computational pipeline that extends assembled contigs to complete or nearly complete genomes while maintaining extension quality. Virseqimprover first uses read coverage to check whether the sequence is chimeric and, if so, breaks it into segments. It then extends the longest segment that has a uniform depth of coverage, and repeats these steps until the sequence can no longer be extended. Finally, Virseqimprover annotates the gene content of the resulting sequence. Results show that Virseqimprover performs well at correcting viral contigs and extending them to their full lengths, and can therefore be a useful tool for improving the completeness and minimizing the assembly errors of viral contigs. Both a web server and a conda package for Virseqimprover are provided to the research community free of charge.
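The check-split-extend loop described in the abstract can be illustrated with a minimal Python sketch. This is not Virseqimprover's code. The window size, the fold-change threshold, and the functions `split_by_coverage`, `longest_segment`, and `improve` are all assumptions made for illustration, and the `get_depth` and `extend` callables stand in for read mapping and assembly steps that the abstract does not specify.

```python
from statistics import median


def split_by_coverage(depth, window=100, max_fold=2.0):
    """Split [0, len(depth)) into segments of roughly uniform depth.

    Hypothetical heuristic: a new segment starts when a window's median
    depth differs from the segment's first window by more than max_fold.
    """
    if not depth:
        return []
    segments, start, ref = [], 0, None
    for w_start in range(0, len(depth), window):
        w_med = median(depth[w_start:w_start + window])
        if ref is None:
            ref = w_med
            continue
        lo, hi = sorted((ref, w_med))
        # A zero-depth window next to a covered one always counts as a break.
        changed = (lo == 0 and hi > 0) or (lo > 0 and hi / lo > max_fold)
        if changed:
            segments.append((start, w_start))
            start, ref = w_start, w_med
    segments.append((start, len(depth)))
    return segments


def longest_segment(segments):
    """Return the (start, end) pair that spans the most positions."""
    return max(segments, key=lambda s: s[1] - s[0])


def improve(seq, get_depth, extend, max_rounds=50):
    """Repeat: trim to the longest uniform segment, then try to extend it.

    get_depth(seq) -> per-base depth list; extend(seq) -> possibly longer
    sequence. Both are placeholders supplied by the caller.
    """
    for _ in range(max_rounds):
        if not seq:
            return seq
        s, e = longest_segment(split_by_coverage(get_depth(seq)))
        trimmed = seq[s:e]
        extended = extend(trimmed)
        if len(extended) <= len(trimmed):  # no growth: stop iterating
            return trimmed
        seq = extended
    return seq
```

For example, a depth profile of 500 positions at 30x followed by 300 positions at 120x is split at position 500, and the first, longer segment is the one kept for extension.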

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC11727651
DOI: http://dx.doi.org/10.7717/peerj.18515

Similar Publications

Background: Mosquito-borne diseases pose a significant public health threat worldwide, with arboviruses accounting for a high proportion of infectious diseases and mortality annually. Brazil, in particular, has been suffering outbreaks of diseases transmitted by mosquito viruses, notably those of the genus, such as dengue, Zika, and chikungunya. Against this background, the São Paulo Zoo is an intriguing ecological niche in which to explore the virome of mosquitoes, potentially shedding light on the dynamics of arbovirus transmission within a confined setting.

Expanding the genomic diversity of human anelloviruses.

Virus Evol

January 2025

MRC-University of Glasgow Centre for Virus Research, The University of Glasgow, Glasgow G61 1QH, United Kingdom.

Anelloviruses are a group of small, circular, single-stranded DNA viruses that are found ubiquitously across mammalian hosts. Here, we explored a large number of publicly available human microbiome datasets and retrieved a total of 829 anellovirus genomes, substantially expanding the known diversity of these viruses. The majority of new genomes fall within the three major human anellovirus genera: , and , while we also present new genomes of the under-sampled , and genera.

Wild rodents serve as crucial reservoirs for zoonotic viruses. Anthropogenic and environmental disruptions, particularly those induced by mining activities, can destabilize rodent populations and facilitate the emergence of viral agents. In the Canaã dos Carajás and Curionópolis regions of Brazil, significant environmental changes have occurred due to mining expansion, potentially creating conditions conducive to the emergence of rodent-associated viral diseases.

Draft genome dataset of strain R-35 isolated from tidal pool sediments.

Data Brief

February 2025

Applied Microbial and Health Biotechnology Institute, Cape Peninsula University of Technology, PO Box 1906, Bellville, Cape Town, 7530, South Africa.

The marine isolate, strain R-35, was isolated from marine sediments collected from the Glencairn Tidal Pool, Table Mountain National Park, Cape Town, South Africa. The genomic DNA was sequenced using the Ion Torrent GeneStudio™ S5 platform, and the assembly was performed using the SPAdes assembler on the Centre for High Performance Computing (CHPC) Lengau Cluster located at the CSIR, Rosebank, South Africa. The draft genome assembly consisted of 722 contigs totaling 7,625,174 base pairs and a G+C% content of 72.
