Assembling contigs in draft genomes using reversals and block-interchanges.

BMC Bioinformatics

Department of Computer Science, National Tsing Hua University, Hsinchu 30013, Taiwan.

Published: December 2013

Next-generation sequencing techniques allow an increasing number of draft genomes to be produced rapidly and at decreasing cost. However, these draft genomes are usually only partially sequenced, as collections of unassembled contigs, and so cannot be used directly by existing algorithms for studying genome rearrangements and reconstructing phylogenies. In this work, we study the one-sided block (or contig) ordering problem with weighted reversal and block-interchange distance. Given a partially assembled genome π and a completely assembled genome σ, the problem is to find an optimal ordering that assembles (i.e., orders and orients) the contigs of π such that the rearrangement distance between the assembled contigs of π and σ, measured by reversals and block-interchanges (also called generalized transpositions) with weight ratio 1:2, is minimized. Beyond genome rearrangement studies and phylogeny reconstruction, the one-sided block ordering problem has a useful application in genome resequencing, because its algorithms can be used to assemble the contigs of a draft genome π based on a reference genome σ. Using permutation groups, we design an efficient algorithm that solves this one-sided block ordering problem in O(δn) time, where n is the number of genes or markers and δ is the number of reversals and block-interchanges used. We also show that the partially assembled genome can be assembled in O(n) time and that its weighted rearrangement distance from the completely assembled genome can be calculated in advance in O(n) time. Finally, we have implemented our algorithm as a program and used simulated datasets to compare its accuracy with that of SIS, an existing tool for the same task that uses a heuristic algorithm considering only reversals, when assembling the contigs of draft genomes based on their reference genomes.

Our experimental results show that our program is more accurate than SIS as the number of reversals and transpositions involved in the rearrangement events between the complete genomes of π and σ increases. In particular, the more transpositions are involved in the rearrangement events, the wider the accuracy gap between our program and SIS becomes.
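
To make the problem statement above concrete, the following is a minimal Python sketch, not the permutation-group algorithm of the paper. It shows the two operations on a signed gene order (a reversal and a block-interchange) and a brute-force search over all orders and orientations of the contigs. As a loudly stated assumption, the search scores each candidate assembly by its number of breakpoints against the reference, a simple stand-in for the weighted reversal and block-interchange distance; the function names and the encoding of genomes as lists of signed integers 1..n are hypothetical choices made for illustration.

```python
from itertools import permutations, product


def reversal(genome, i, j):
    """Reverse the segment genome[i:j] and flip the sign of each gene in it."""
    return genome[:i] + [-g for g in reversed(genome[i:j])] + genome[j:]


def block_interchange(genome, i, j, k, l):
    """Swap the non-overlapping blocks genome[i:j] and genome[k:l].

    When j == k the two blocks are adjacent, which is an ordinary transposition.
    """
    assert 0 <= i < j <= k < l <= len(genome)
    return genome[:i] + genome[k:l] + genome[j:k] + genome[i:j] + genome[l:]


def breakpoints(genome, reference):
    """Count adjacencies of `genome` that are absent from `reference`.

    Assumption: both are linear signed gene orders over markers 1..n,
    capped with 0 on the left and n + 1 on the right.
    """
    def capped(g):
        return [0] + list(g) + [len(g) + 1]

    ref = capped(reference)
    allowed = set()
    for a, b in zip(ref, ref[1:]):
        allowed.add((a, b))    # adjacency read left to right
        allowed.add((-b, -a))  # same adjacency on the reverse strand
    gen = capped(genome)
    return sum((a, b) not in allowed for a, b in zip(gen, gen[1:]))


def order_contigs_brute_force(contigs, reference):
    """Try every order and orientation of the contigs (exponential time).

    Returns (score, assembled_genome) with the fewest breakpoints.
    Illustrative only: the score is a proxy, not the weighted distance.
    """
    best = None
    for order in permutations(range(len(contigs))):
        for flips in product([False, True], repeat=len(contigs)):
            assembled = []
            for idx, flip in zip(order, flips):
                contig = contigs[idx]
                if flip:
                    assembled += [-g for g in reversed(contig)]
                else:
                    assembled += list(contig)
            score = breakpoints(assembled, reference)
            if best is None or score < best[0]:
                best = (score, assembled)
    return best


if __name__ == "__main__":
    sigma = [1, 2, 3, 4, 5, 6]           # reference genome
    pi = [[3, 4], [1, 2], [-6, -5]]      # draft genome as three contigs
    print(order_contigs_brute_force(pi, sigma))
```

The brute-force search examines k!·2^k candidate assemblies for k contigs, so it is usable only on toy inputs; it is meant to show what "order and orient" means, in contrast to the O(δn) algorithm the abstract reports.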

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3622642
DOI: http://dx.doi.org/10.1186/1471-2105-14-S5-S9

