Background: The alignment of short reads generated by next-generation sequencers to genomes is an important problem in many biomedical and bioinformatics applications. Although many proposed methods work very well on narrow ranges of read lengths, they tend to suffer in performance and alignment quality for reads outside of these ranges.

Results: We introduce RandAL, a novel method that aligns DNA sequences to reference genomes. Our approach uses two FM indices to support efficient bidirectional searching, a pruning heuristic to speed up the computation of edit distances, and, most importantly, a randomized strategy that enables effective estimation of key parameters. Extensive comparisons showed that RandAL outperformed popular aligners in most instances and was unique in its consistent and accurate performance over a wide range of read lengths and error rates. The software package is publicly available at https://github.com/namsyvo/RandAL.
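To make the idea of pruning an edit-distance computation concrete, the following is a minimal sketch of one generic technique: abandoning the dynamic-programming table as soon as no alignment can stay within a distance threshold. It is not RandAL's actual heuristic, which the abstract does not specify; the function name `bounded_edit_distance` and the parameter `max_dist` are hypothetical and chosen only for illustration.

```python
from typing import Optional


def bounded_edit_distance(read: str, ref: str, max_dist: int) -> Optional[int]:
    """Return the edit distance between read and ref, or None if it exceeds max_dist.

    Illustrative early-abandon pruning only; not taken from RandAL.
    """
    # The length difference is a lower bound on the edit distance.
    if abs(len(read) - len(ref)) > max_dist:
        return None

    # prev[j] holds the distance between the processed prefix of read and ref[:j].
    prev = list(range(len(ref) + 1))
    for i, a in enumerate(read, start=1):
        curr = [i]
        for j, b in enumerate(ref, start=1):
            cost = 0 if a == b else 1
            curr.append(min(prev[j] + 1,          # skip a character of read
                            curr[j - 1] + 1,      # skip a character of ref
                            prev[j - 1] + cost))  # match or substitution
        # Values never decrease along an alignment path, so once every cell
        # in a row exceeds the threshold the final distance must exceed it too.
        if min(curr) > max_dist:
            return None
        prev = curr

    return prev[-1] if prev[-1] <= max_dist else None


if __name__ == "__main__":
    print(bounded_edit_distance("ACGT", "AGGT", 1))
    print(bounded_edit_distance("AAAA", "TTTT", 1))
```

The threshold lets a candidate location be rejected after only a few rows when the read clearly does not match there, which is the general motivation for pruning in read alignment.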

Conclusions: RandAL promises to align short reads effectively and accurately across a variety of sequencing technologies with different read lengths and sequencing error rates.

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4120144
DOI: http://dx.doi.org/10.1186/1471-2164-15-S5-S2
