Assessing the significance of consistently mis-regulated genes in cancer associated gene expression matrices.

Mattias Wahde, Gregory T Klus, Michael L Bittner, Yidong Chen, Zoltan Szallasi

Bioinformatics

Division of Mechatronics, Chalmers University of Technology, Göteborg, Sweden.

Published: March 2002

Motivation: The simplest level of statistical analysis of cancer associated gene expression matrices is aimed at finding consistently up- or down-regulated genes within a given set of tumor samples. Considering the high level of gene expression diversity detected in cancer, one needs to assess the probability that the consistent mis-regulation of a given gene is due to chance. Furthermore, it is important to determine the required sample number that will ensure the meaningful statistical analysis of massively parallel gene expression measurements.

Results: The probability of consistent mis-regulation is calculated in this paper for binarized gene expression data, using combinatorial considerations. For practical purposes, we also provide a set of accurate approximate formulas for determining the same probability in a computationally less intensive way. When the pool of mis-regulatable genes is restricted, the probability of consistent mis-regulation can be overestimated. We show, however, that this effect has little practical consequence for cancer associated gene expression measurements published in the literature. Finally, in order to aid experimental design, we provide estimates of the sample number required to ensure that the detected consistent mis-regulation is not due to chance. Our results suggest that fewer than 20 sufficiently diverse tumor samples may be enough to identify consistently mis-regulated genes in a statistically significant manner.
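To make the kind of calculation described above concrete, the following is a minimal Python sketch under a deliberately simplified null model. It is not the paper's equation (4), which is not reproduced in this abstract. The assumptions are ours: each of N samples has exactly k mis-regulated genes, drawn uniformly at random and independently from a pool of G genes. The function names and parameters (p_chance_consistent, poisson_approx, min_samples, G, k, N, alpha) are hypothetical. Under this model, inclusion-exclusion gives the probability that at least one gene is mis-regulated in every sample purely by chance; a Poisson-style approximation and a search for the smallest adequate sample number are included for comparison.

```python
from fractions import Fraction
from math import comb, exp


def p_chance_consistent(G, k, N):
    """Exact probability that at least one gene is mis-regulated in all N
    samples by chance. Assumed null model (not taken from the paper): each
    sample independently marks a uniformly random set of k out of G genes."""
    if not (0 <= k <= G) or N < 1:
        raise ValueError("need 0 <= k <= G and N >= 1")
    denom = comb(G, k)
    total = Fraction(0)
    for j in range(1, k + 1):
        # Probability that a fixed set of j genes is marked in every sample.
        p_all = Fraction(comb(G - j, k - j), denom) ** N
        # Inclusion-exclusion over all sets of j genes.
        total += (-1) ** (j + 1) * comb(G, j) * p_all
    return total


def poisson_approx(G, k, N):
    """Cheaper approximation: treat the number of chance-consistent genes as
    Poisson with mean G * (k / G) ** N (an assumption of this sketch)."""
    expected = G * (k / G) ** N
    return 1.0 - exp(-expected)


def min_samples(G, k, alpha=0.05, max_n=1000):
    """Smallest N for which the chance probability drops below alpha,
    or None if no such N is found up to max_n."""
    for N in range(1, max_n + 1):
        if p_chance_consistent(G, k, N) < alpha:
            return N
    return None


# With 2 genes and 1 marked per sample, two samples agree with probability 1/2.
assert p_chance_consistent(2, 1, 2) == Fraction(1, 2)
```

Exact rational arithmetic is used because the alternating sum can lose precision in floating point; for realistic pool sizes the approximation is much cheaper, and comparing the two on small cases is a reasonable sanity check.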

Availability: A Mathematica (TM) implementation of the paper's main equation, equation (4), is available at www.me.chalmers.se/~mwahde/bioinfo.html.

DOI: http://dx.doi.org/10.1093/bioinformatics/18.3.389

Publication Analysis

Top Keywords

gene expression

consistent mis-regulation

cancer associated

associated gene

probability consistent

consistently mis-regulated

mis-regulated genes

expression matrices

statistical analysis

tumor samples
