Principal components analysis (PCA) is a common unsupervised method for analyzing gene expression microarray data, providing information on the overall structure of the dataset. In recent years, it has been applied to very large datasets spanning many different tissues and cell types in order to create a low-dimensional global map of human gene expression. Here, we reevaluate this approach and show that the linear intrinsic dimensionality of this global map is higher than previously reported. Furthermore, we analyze the cases in which PCA fails to detect biologically relevant information and point the reader to methods that overcome these limitations. Our results refine the current understanding of the overall structure of gene expression spaces and show that what PCA detects depends critically on the effect size of the biological signal as well as on the fraction of samples containing this signal.
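
The dependence on effect size and sample fraction can be illustrated with a toy simulation. The sketch below is not the authors' analysis: the function name `pc1_group_correlation`, the Gaussian noise model, and all sizes and parameter values are assumptions chosen for illustration. It adds a mean shift to a block of "genes" in a subset of "samples", runs PCA via SVD of the centered matrix, and reports how strongly the first principal component tracks the affected group.

```python
import numpy as np


def pc1_group_correlation(effect_size, fraction, n_samples=200, n_genes=500,
                          n_signal_genes=50, seed=0):
    """Absolute correlation between PC1 scores and group membership.

    Hypothetical toy model: unit-variance Gaussian noise, plus a mean shift
    of `effect_size` on the first `n_signal_genes` genes for the first
    `fraction` of the samples.
    """
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((n_samples, n_genes))

    # Mark the affected samples and add the shift to the signal genes.
    n_affected = max(1, int(round(fraction * n_samples)))
    group = np.zeros(n_samples)
    group[:n_affected] = 1.0
    X[:n_affected, :n_signal_genes] += effect_size

    # PCA via SVD of the gene-wise centered matrix.
    Xc = X - X.mean(axis=0)
    U, S, _ = np.linalg.svd(Xc, full_matrices=False)
    pc1_scores = U[:, 0] * S[0]

    # The sign of a principal component is arbitrary, so use |r|.
    return abs(np.corrcoef(pc1_scores, group)[0, 1])


if __name__ == "__main__":
    scenarios = [
        ("large effect, half of the samples", 1.0, 0.50),
        ("small effect, half of the samples", 0.2, 0.50),
        ("large effect, 2% of the samples", 1.0, 0.02),
    ]
    for label, effect, frac in scenarios:
        r = pc1_group_correlation(effect, frac)
        print(f"{label}: |corr(PC1, group)| = {r:.2f}")
```

In this toy setting, the first component separates the groups only when the shift is large and shared by many samples. Shrinking either the effect size or the affected fraction lets the noise dominate the leading component, even though the signal is still present in the data.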

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4890592
DOI: http://dx.doi.org/10.1038/srep25696
