Recovering genetic regulatory networks from micro-array data and location analysis data.

Genome Inform

LTI, School of Computer Science, Carnegie Mellon Univ., 4502 Newell Simon Hall, 5000 Forbes Ave, Pittsburgh, PA 15213, USA.

Published: January 2006

Learning large networks (with hundreds of variables) is attracting growing interest with the emergence of high-throughput biological data sources such as micro-array data. In this paper, we investigate two popular large-scale network structure learning algorithms: sparse candidate hill climbing (SCHC) and the Grow-Shrinkage (GS) algorithm. The experiments show that both have serious effectiveness problems when the number of variables (genes) is large compared to the number of instances (experimental conditions), which is common in micro-array data. We further propose a new large-scale structure learning algorithm based on Lasso regression. Theoretical analysis in [10] suggests that the L1-norm penalty in Lasso regression makes our algorithm especially suitable when the numbers of variables and instances are unbalanced. Our algorithm achieves much better results than SCHC and GS on synthetic data. We also show the effectiveness of our algorithm by learning genetic regulatory network modules from real micro-array data (with more than 6000 genes), combined with genome-wide location analysis data. The learned results are consistent with biological knowledge.
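The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea behind Lasso-based structure learning, not the authors' method: regress one gene on all the others with an L1 penalty and treat the genes that keep a nonzero weight as its candidate neighbors. The function names, the coordinate-descent solver, the penalty value `lam=0.3`, and the five-gene synthetic data are all assumptions made for illustration.

```python
import random


def soft_threshold(z, t):
    """Shrink z toward zero by t; values within [-t, t] become exactly 0."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def center_columns(rows):
    """Subtract each column's mean so no intercept term is needed."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    return [[r[j] - means[j] for j in range(p)] for r in rows]


def lasso_cd(X, y, lam, n_iter=200):
    """Minimize (1/(2n)) * ||y - Xw||^2 + lam * ||w||_1 by coordinate descent."""
    n, p = len(X), len(X[0])
    w = [0.0] * p
    resid = list(y)  # residual y - Xw, with w starting at zero
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # Correlation of column j with the residual that excludes column j.
            rho = sum(X[i][j] * (resid[i] + X[i][j] * w[j]) for i in range(n)) / n
            new_w = soft_threshold(rho, lam) / col_sq[j]
            delta = new_w - w[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= X[i][j] * delta
                w[j] = new_w
    return w


def lasso_neighbors(data, target, lam):
    """Indices of variables with nonzero Lasso weight when predicting `target`."""
    data = center_columns(data)
    others = [j for j in range(len(data[0])) if j != target]
    X = [[row[j] for j in others] for row in data]
    y = [row[target] for row in data]
    w = lasso_cd(X, y, lam)
    return [others[k] for k, wk in enumerate(w) if wk != 0.0]


def make_demo_data(n=100, seed=0):
    """Hypothetical toy data: gene 2 depends on genes 0 and 1; genes 3, 4 are noise."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        g0, g1, g3, g4 = (rng.gauss(0, 1) for _ in range(4))
        g2 = 2.0 * g0 - 1.5 * g1 + 0.1 * rng.gauss(0, 1)
        rows.append([g0, g1, g2, g3, g4])
    return rows


if __name__ == "__main__":
    # Candidate regulators of gene 2 under the assumed penalty strength.
    print(lasso_neighbors(make_demo_data(), target=2, lam=0.3))
```

Repeating `lasso_neighbors` for every gene would yield one candidate neighbor set per gene, which could then be assembled into a network. Because the L1 penalty drives most weights to exactly zero, each regression remains usable even when there are far more genes than experimental conditions, which is the setting the abstract highlights.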


