In contrast to conventional data mining, which searches for specific subsets of genes (extensive variables) to correlate with specific phenotypes, DATE analysis correlates intensive state variables calculated from the same datasets. At the heart of DATE analysis are two biological equations of state not dependent on genetic pathways. This result distinguishes DATE analysis from other bioinformatics approaches. The dimensionless state variable F quantifies the relative overall cellular activity of test cells compared to well-chosen reference cells. The variable pi(i) is the fold-change in the expression of the ith gene of test cells relative to reference. It is the fraction phi of the genome undergoing differential expression-not the magnitude pi-that controls biological change. The state variable phi is equivalent to the control strength of metabolic control analysis. For tractability, DATE analysis assumes a linear system of enzyme-connected networks and exploits the small average contribution of each cellular component. This approach was validated by reproducible values of the state variables F, RNA index, and phi calculated from random subsets of transcript microarray data. Using published microarray data, F, RNA index, and phi were correlated with: (1) the blood-feeding cycle of the malaria parasite, (2) embryonic development of the fruit fly, (3) temperature adaptation of Killifish, (4) exponential growth of cultured S. pneumoniae, and (5) human cancers. DATE analysis was applied to aCGH data from the great apes. A good example of the power of DATE analysis is its application to genomically unstable cancers, which have been refractory to data mining strategies.
Download full-text PDF |
Source |
---|---|
http://dx.doi.org/10.1002/btpr.239 | DOI Listing |
Enter search terms and have AI summaries delivered each week - change queries or unsubscribe any time!