Improving bacterial genome assembly using a test of strand orientation.

Bioinformatics

Department of Electrical and Computer Engineering, University of Illinois at Urbana-Champaign.

Published: September 2022

Summary: The complexity of genome assembly is due in large part to the presence of repeats. In particular, large reverse-complemented repeats can lead to incorrect inversions of large segments of the genome. To detect and correct such inversions in finished bacterial genomes, we propose a statistical test based on tetranucleotide frequency (TNF), which determines whether two segments from the same genome are of the same or opposite orientation. In most cases, the test neatly partitions the genome into two segments of roughly equal length with seemingly opposite orientations. This corresponds to the segments between the DNA replication origin and terminus, which were previously known to have distinct nucleotide compositions. We show that, in several cases where this balanced partition is not observed, the test identifies a potential inverted misassembly, which is validated by the presence of a reverse-complemented repeat at the boundaries of the inversion. After inverting the sequence between the repeat, the balance of the misassembled genome is restored. Our method identifies 31 potential misassemblies in the NCBI database, several of which are further supported by a reassembly of the read data.
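The orientation idea in the summary can be illustrated with a minimal Python sketch. It is not the authors' statistical test, whose details are not given in the abstract. As an assumption for illustration only, it compares strand-specific tetranucleotide frequency vectors with a plain Euclidean distance, and asks whether a segment looks more like a reference segment as given or after reverse-complementing. All function names (`tnf`, `orientation_score`, `invert_segment`) are hypothetical and do not come from the Oriented-TNF repository.

```python
from collections import Counter
from itertools import product
import math

COMPLEMENT = str.maketrans("ACGT", "TGCA")
# All 256 tetranucleotides in a fixed order, so vectors are comparable.
KMERS = ["".join(p) for p in product("ACGT", repeat=4)]


def reverse_complement(seq):
    """Return the reverse complement of an upper-case A/C/G/T string."""
    return seq.translate(COMPLEMENT)[::-1]


def tnf(seq):
    """Strand-specific tetranucleotide frequency vector (length 256)."""
    seq = seq.upper()
    counts = Counter(seq[i:i + 4] for i in range(len(seq) - 3))
    # Windows containing characters other than A/C/G/T are ignored.
    total = sum(counts[k] for k in KMERS)
    if total == 0:
        raise ValueError("sequence has no valid tetranucleotides")
    return [counts[k] / total for k in KMERS]


def distance(u, v):
    """Euclidean distance between two frequency vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def orientation_score(seg_a, seg_b):
    """Illustrative score (an assumption, not the published statistic).

    Positive: seg_b resembles seg_a more as given (same orientation).
    Negative: seg_b resembles seg_a more after reverse-complementing.
    """
    fa = tnf(seg_a)
    d_same = distance(fa, tnf(seg_b))
    d_flip = distance(fa, tnf(reverse_complement(seg_b)))
    return d_flip - d_same


def invert_segment(genome, start, end):
    """Reverse-complement genome[start:end], leaving the flanks unchanged."""
    return genome[:start] + reverse_complement(genome[start:end]) + genome[end:]
```

In this sketch, a suspected inversion would show up as a run of windows with a negative score relative to their neighbours, and `invert_segment` applies the correction described in the summary once the boundaries (for example, the positions of the reverse-complemented repeat) are known. A real analysis would need a proper null model and significance threshold, which this example does not provide.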

Availability and Implementation: A GitHub repository is available at https://github.com/gcgreenberg/Oriented-TNF.git.

Supplementary Information: Supplementary data are available at Bioinformatics online.

Source
http://dx.doi.org/10.1093/bioinformatics/btac516
