Mining subspace clusters from DNA microarray data using large itemset techniques.

Ye-In Chang, Jiun-Rung Chen, Yueh-Chi Tsai

J Comput Biol

Department of Computer Science and Engineering, National Sun Yat-Sen University, Taiwan, R.O.C.

Published: May 2009

Mining subspace clusters from DNA microarrays could help researchers identify genes that commonly contribute to a disease, where a subspace cluster is a subset of genes whose expression levels are similar under a subset of conditions. Because the number of genes in a DNA microarray is far larger than the number of conditions, previously proposed algorithms that compute the maximum dimension sets (MDSs) for every pair of genes take a long time to mine subspace clusters. In this article, we propose the Large Itemset-Based Clustering (LISC) algorithm for mining subspace clusters. Instead of constructing MDSs for every pair of genes, we construct MDSs only for every pair of conditions. We then transform the task of finding the maximal possible gene sets into the problem of mining large itemsets from the condition-pair MDSs. Since we are interested only in subspace clusters whose gene sets are as large as possible, it is desirable to focus on gene sets that have reasonably large support values in the condition-pair MDSs. Our simulation results show that the proposed algorithm needs less processing time than previously proposed algorithms that construct gene-pair MDSs.
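The two steps named in the abstract, building condition-pair MDSs and then mining large itemsets from them, can be illustrated with a minimal Python sketch. It is not the LISC implementation. The abstract does not define an MDS, so the sketch assumes that the MDS of a condition pair (a, b) is a maximal set of genes whose expression differences between a and b span at most a threshold `delta`. The function names, the `delta` and `min_support` parameters, the toy expression values, and the plain level-wise itemset search are all hypothetical choices made for illustration. The sketch stops at large gene sets and does not assemble final subspace clusters.

```python
from itertools import combinations


def condition_pair_mds(expr, a, b, delta, min_genes=2):
    """Maximal gene sets for conditions (a, b) under the ASSUMED definition:
    the values expr[g][a] - expr[g][b] of the genes in a set span at most delta."""
    diffs = sorted((row[a] - row[b], gene) for gene, row in expr.items())
    n = len(diffs)
    found, end, prev_end = [], 0, -1
    for start in range(n):
        end = max(end, start)
        # Extend the window while the spread of differences stays within delta.
        while end + 1 < n and diffs[end + 1][0] - diffs[start][0] <= delta:
            end += 1
        # A window is maximal only if it reaches further right than any earlier one.
        if end > prev_end:
            if end - start + 1 >= min_genes:
                found.append(frozenset(g for _, g in diffs[start:end + 1]))
            prev_end = end
    return found


def build_transactions(expr, delta):
    """One transaction (a set of genes) per MDS, over all condition pairs."""
    n_cond = len(next(iter(expr.values())))
    transactions = []
    for a, b in combinations(range(n_cond), 2):
        transactions.extend(condition_pair_mds(expr, a, b, delta))
    return transactions


def support(itemset, transactions):
    """Number of transactions that contain every gene in itemset."""
    return sum(1 for t in transactions if itemset <= t)


def large_itemsets(transactions, min_support):
    """Level-wise search for gene sets with support >= min_support."""
    genes = sorted({g for t in transactions for g in t})
    level = [frozenset([g]) for g in genes
             if support(frozenset([g]), transactions) >= min_support]
    large = {}
    while level:
        for s in level:
            large[s] = support(s, transactions)
        k = len(level[0]) + 1
        # Join pairs of k-1 sets into size-k candidates, then keep the large ones.
        candidates = {x | y for x, y in combinations(level, 2) if len(x | y) == k}
        level = [c for c in candidates if support(c, transactions) >= min_support]
    return large


# Toy data (hypothetical): rows are genes, columns are three conditions.
expr = {
    "g1": [1, 3, 6],
    "g2": [2, 4, 7],
    "g3": [5, 7, 10],
    "g4": [9, 2, 4],
}
transactions = build_transactions(expr, delta=1)
large = large_itemsets(transactions, min_support=3)
biggest = max(large, key=len)
print(sorted(biggest), large[biggest])
```

With four genes and three conditions, the sketch examines three condition pairs rather than six gene pairs. On a real microarray, where genes greatly outnumber conditions, the same asymmetry is what the abstract credits for the shorter processing time.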

DOI: http://dx.doi.org/10.1089/cmb.2008.0161

