Chromosomes and other long DNA sequences contain many highly similar repeated sub-sequences. While there are efficient methods for detecting strict repeats or already characterized repeats, there is no software available for detecting approximate repeats in large DNA sequences that allows for weighted substitutions and indels in a coherent statistical framework. Here, we present an implementation of a two-step method (seed detection followed by seed extension) that detects such approximate repeats. Our method is computationally efficient enough to handle large sequences and flexible enough to account for influencing factors, such as sequence-composition biases, at both the seed detection and alignment levels.
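The general seed-then-extend idea can be illustrated with a minimal sketch. This is not the RepSeek algorithm: it assumes exact k-mer matches as seeds, a simple ungapped rightward extension with fixed match/mismatch scores, and an X-drop stopping rule, and it ignores indels, weighted substitutions, and any statistical assessment of seeds. The function names and the parameters `k`, `match`, `mismatch`, and `xdrop` are hypothetical choices made for illustration only.

```python
from collections import defaultdict


def find_seeds(seq, k):
    """Return sorted pairs (i, j), i < j, such that seq[i:i+k] == seq[j:j+k].

    Assumption: a seed is an exact repeated k-mer. Overlapping occurrences
    are not filtered out.
    """
    positions = defaultdict(list)
    for i in range(len(seq) - k + 1):
        positions[seq[i:i + k]].append(i)

    seeds = []
    for pos in positions.values():
        for a in range(len(pos)):
            for b in range(a + 1, len(pos)):
                seeds.append((pos[a], pos[b]))
    return sorted(seeds)


def extend_right(seq, i, j, k, match=1, mismatch=-1, xdrop=3):
    """Extend a seed to the right without gaps.

    Returns (best_length, best_score) for the two copies starting at i and j.
    Extension stops at the end of the sequence, when the first copy would
    run into the second, or when the score falls xdrop below its best value.
    """
    score = best = k * match
    best_len = length = k
    while j + length < len(seq) and i + length < j:
        score += match if seq[i + length] == seq[j + length] else mismatch
        length += 1
        if score > best:
            best, best_len = score, length
        elif best - score >= xdrop:
            break
    return best_len, best


if __name__ == "__main__":
    # Two copies of an 8-base motif that differ at one position.
    dna = "ACGTTGCAGGACGTAGCA"
    for i, j in find_seeds(dna, 4):
        print(i, j, extend_right(dna, i, j, 4))
```

On the toy sequence above, the only seed is the 4-mer `ACGT` at positions 0 and 10, and the extension tolerates the single mismatch to recover both 8-base copies with a score of 6. A tool of the kind described in the abstract would additionally extend to the left, allow indels, and choose seed and score thresholds statistically.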

Availability: http://wwwabi.snv.jussieu.fr/public/RepSeek/

Source: http://dx.doi.org/10.1093/bioinformatics/btl519
