A preprocessor for shotgun assembly of large genomes.

J Comput Biol

Institute for Physical Science and Technology, University of Maryland, College Park, MD 20742-2431, USA.

Published: January 2005

The whole-genome shotgun (WGS) assembly technique has been remarkably successful in efforts to determine the sequence of bases that make up a genome. WGS assembly begins with a large collection of short fragments selected at random from a genome. The sequence of bases at each end of a fragment is determined, albeit imprecisely, yielding a sequence of letters called a "read." Each letter in a read is assigned a quality value, which estimates the probability that a sequencing error occurred in determining that letter. Reads are typically cut off after about 500 letters, the point at which sequencing errors become pervasive. We report on a set of procedures that (1) correct most of the sequencing errors, (2) adjust the quality values accordingly, and (3) produce a list of "overlaps," i.e., pairs of reads that plausibly come from overlapping parts of the genome. Our procedures, which we collectively call the "UMD Overlapper," can be run iteratively and as a preprocessor for other assemblers. We tested the UMD Overlapper on Celera's Drosophila reads. When we substituted it for Celera's overlap procedures in the front end of their assembler, the assembler produced a significantly improved genome.
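To make the notions of "quality value" and "overlap" concrete, here is a minimal Python sketch. It is not the UMD Overlapper's method, which the abstract does not describe. It assumes the widely used Phred convention for quality values (Q = -10 log10 p), and it treats an overlap as an exact suffix-to-prefix match, a simplification that ignores sequencing errors and reverse complements. The function names, the `min_len` parameter, and the toy reads are hypothetical and chosen only for illustration.

```python
from itertools import permutations


def error_probability(quality: int) -> float:
    """Convert a quality value to an error probability.

    Assumes the Phred convention Q = -10 * log10(p); the abstract
    does not say which scale the authors use.
    """
    return 10 ** (-quality / 10)


def suffix_prefix_overlap(a: str, b: str, min_len: int = 3) -> int:
    """Return the length of the longest suffix of `a` that exactly
    matches a prefix of `b`, or 0 if none reaches `min_len`."""
    for length in range(min(len(a), len(b)), min_len - 1, -1):
        if a[-length:] == b[:length]:
            return length
    return 0


def find_overlaps(reads: dict[str, str], min_len: int = 3) -> list[tuple[str, str, int]]:
    """List (read_a, read_b, overlap_length) for every ordered pair of
    distinct reads whose exact suffix/prefix overlap is at least `min_len`.

    This is a quadratic all-pairs scan, fine for a toy example only.
    """
    overlaps = []
    for (name_a, seq_a), (name_b, seq_b) in permutations(reads.items(), 2):
        length = suffix_prefix_overlap(seq_a, seq_b, min_len)
        if length:
            overlaps.append((name_a, name_b, length))
    return overlaps


# Toy reads (hypothetical), each sharing four letters with the next.
reads = {"r1": "ACGTTGCA", "r2": "TGCAGGTC", "r3": "GGTCAAAT"}
overlaps = find_overlaps(reads)
```

With these toy reads, `overlaps` holds the pairs (r1, r2) and (r2, r3), each with an overlap of four letters. Real reads contain errors, so a practical overlapper has to tolerate mismatches and weigh them by quality value; exact matching is used here only to keep the idea visible.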


Source
http://dx.doi.org/10.1089/cmb.2004.11.734


