The properties of high-dimensional data spaces: implications for exploring gene and protein expression data.

Nat Rev Cancer

Department of Oncology and Lombardi Comprehensive Cancer Center, Georgetown University School of Medicine, 3970 Reservoir Road NW, Washington, DC 20057, USA.

Published: January 2008

High-throughput genomic and proteomic technologies are widely used in cancer research to build better predictive models of diagnosis, prognosis and therapy, to identify and characterize key signalling networks and to find new targets for drug development. These technologies present investigators with the task of extracting meaningful statistical and biological information from high-dimensional data spaces, wherein each sample is defined by hundreds or thousands of measurements, usually concurrently obtained. The properties of high dimensionality are often poorly understood or overlooked in data modelling and analysis. From the perspective of translational science, this Review discusses the properties of high-dimensional data spaces that arise in genomic and proteomic studies and the challenges they can pose for data analysis and interpretation.

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2238676
DOI: http://dx.doi.org/10.1038/nrc2294
