Integration of clinical and gene expression data has a synergetic effect on predicting breast cancer outcome.

PLoS One

Delft Bioinformatics Laboratory, Faculty of Electrical Engineering, Mathematics and Computer Science, Delft University of Technology, Mekelweg, Delft, The Netherlands.

Published: March 2013

Breast cancer outcome can be predicted using models derived from gene expression data or clinical data. Only a few studies have created a single prediction model using both gene expression and clinical data. These studies often remain inconclusive regarding an obtained improvement in prediction performance. We rigorously compare three different integration strategies (early, intermediate, and late integration) as well as classifiers employing no integration (only one data type) using five classifiers of varying complexity. We perform our analysis on a set of 295 breast cancer samples, for which gene expression data and an extensive set of clinical parameters are available as well as four breast cancer datasets containing 521 samples that we used as independent validation.mOn the 295 samples, a nearest mean classifier employing a logical OR operation (late integration) on clinical and expression classifiers significantly outperforms all other classifiers. Moreover, regardless of the integration strategy, the nearest mean classifier achieves the best performance. All five classifiers achieve their best performance when integrating clinical and expression data. Repeating the experiments using the 521 samples from the four independent validation datasets also indicated a significant performance improvement when integrating clinical and gene expression data. Whether integration also improves performances on other datasets (e.g. other tumor types) has not been investigated, but seems worthwhile pursuing. Our work suggests that future models for predicting breast cancer outcome should exploit both data types by employing a late OR or intermediate integration strategy based on nearest mean classifiers.

Download full-text PDF

Source
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3394805PMC
http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0040358PLOS

Publication Analysis

Top Keywords

gene expression
20
expression data
20
breast cancer
20
cancer outcome
12
data
9
integration
8
integration clinical
8
clinical gene
8
predicting breast
8
clinical data
8

Similar Publications

Want AI Summaries of new PubMed Abstracts delivered to your In-box?

Enter search terms and have AI summaries delivered each week - change queries or unsubscribe any time!