Fast and accurate read alignment for resequencing.

Bioinformatics

Department of Electrical Engineering, Stanford University, Stanford, CA 94305, USA.

Published: September 2012

Motivation: Next-generation sequence analysis has become an important task in both laboratory and clinical settings. A key stage in the majority of sequence analysis workflows, such as resequencing, is the alignment of genomic reads to a reference genome. The accurate alignment of reads with large indels is a computationally challenging task for researchers.

Results: We introduce SeqAlto as a new algorithm for read alignment. For reads of 100 bp or longer, SeqAlto is up to 10× faster than existing algorithms, while retaining high accuracy and the ability to align reads with large (up to 50 bp) indels. This improvement in efficiency is particularly important in the analysis of future sequencing data where the number of reads approaches many billions. Furthermore, SeqAlto uses less than 8 GB of memory to align against the human genome. SeqAlto is benchmarked against several existing tools with both real and simulated data.

Availability: Linux and Mac OS X binaries free for academic use are available at http://www.stanford.edu/group/wonglab/seqalto

Contact: whwong@stanford.edu.


Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3436849
DOI: http://dx.doi.org/10.1093/bioinformatics/bts450


