An efficient motif discovery algorithm with unknown motif length and number of binding sites.

Int J Data Min Bioinform

Department of Computer Science, The University of Hong Kong, Hong Kong, China.

Published: May 2008

Most algorithms for discovering motifs in DNA sequences require the motif's length as input. Styczynski et al. introduced the Extended (l,d)-Motif Problem (EMP), in which the motif's length is not an input parameter. Unfortunately, their algorithm takes an unacceptably long time to run, e.g. over 3 months to discover a length-14 motif. Since the best motif may be neither the longest nor the one with the most binding sites, in this paper we also eliminate a second input parameter, the minimum number of binding sites, in order to provide more realistic and robust results. We also develop an efficient algorithm that solves both EMP and this redefined problem.
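
The abstract does not define the underlying problem, so the following is only a minimal sketch, assuming the usual planted (l,d)-motif formulation: find every length-l string that occurs in each input sequence with at most d mismatches. It is a brute-force illustration of why l is normally required as input, not the algorithm of this paper or of Styczynski et al.; the function names and the toy sequences are hypothetical.

```python
from itertools import product


def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def occurs_within(candidate, sequence, d):
    """True if some window of `sequence` is within d mismatches of `candidate`."""
    l = len(candidate)
    return any(
        hamming(candidate, sequence[i:i + l]) <= d
        for i in range(len(sequence) - l + 1)
    )


def ld_motifs(sequences, l, d):
    """Enumerate all 4**l DNA strings of length l (assumed alphabet ACGT)
    and keep those that occur in every sequence with at most d mismatches."""
    motifs = []
    for letters in product("ACGT", repeat=l):
        candidate = "".join(letters)
        if all(occurs_within(candidate, s, d) for s in sequences):
            motifs.append(candidate)
    return motifs


if __name__ == "__main__":
    # Toy input, made up for illustration only.
    toy = ["ACGTAC", "TTACGA", "GACGTT"]
    print(ld_motifs(toy, l=3, d=0))
```

The search space here grows as 4^l for a single fixed length, which gives a rough sense of why a formulation that leaves the length open is much more expensive to solve naively.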

Source
http://dx.doi.org/10.1504/ijdmb.2006.010856
