ScientificWorldJournal
September 2013
A hybrid self-adaptive harmony search and back-propagation mining system was proposed to discover weighted patterns in human intron sequences. Testing the weights under a lazy nearest neighbor classifier yielded numerical results that revealed the significance of these weighted patterns. A comparison with the popular intron consensus model shows that the discovered weighted patterns make the originally ambiguous 5SS and 3SS header patterns more specific and concrete.
Int J Data Min Bioinform
April 2013
Splice sites are essential for pre-mRNA maturation and crucial for Splice Site Modelling (SSM); however, there are gaps between the splicing signals and the computationally identified sequence features. In this paper, Locality Sensitive Features (LSFs) are proposed to reduce these gaps by homogenising their contexts. Using skewness-kurtosis based statistics and data analysis, SSM attributed with LSFs is carried out by double-boundary outlier filters.
Current computational predictions of splice sites largely depend on the sequence patterns of known intronic sequence features (ISFs) described in the classical intron definition model (IDM). The computation-oriented IDM (CO-IDM) provides more specific and concrete information for describing intron flanks of splice sites (IFSSs). In this paper, we propose a novel approach based on fuzzy decision trees (FDTs), which use (1) weighted ISFs of twelve uni-frame patterns (UFPs) and forty-five multi-frame patterns (MFPs) and (2) gain ratios to improve performance in identifying introns.
Int J Data Min Bioinform
June 2009
Core Promoter Elements (CPEs) are key players in transcription initiation. Identifying CPEs is crucial for understanding gene expression. In this paper, a framework for finding new CPEs is proposed.