Finding biomarker signatures in pooled sample designs: a simulation framework for methodological comparisons.

Anna Telaar, Gerd Nürnberg, Dirk Repsilber

Adv Bioinformatics

Genetics and Biometry, Leibniz Institute for Farm Animal Biology, Wilhelm-Stahl-Allee 2, D-18196 Dummerstorf, Germany.

Published: July 2011

Detection of discriminating patterns in gene expression data can be accomplished with various methods of statistical learning. It has been proposed that sample pooling in this context would have negative effects; however, pooling cannot always be avoided. We propose a simulation framework to explicitly investigate how the parameters of the patterns, the experimental design, the noise, and the choice of method affect classification performance. We use a two-group classification task and simulated gene expression data with independent differentially expressed genes, bivariate linear patterns, and the combination of both. Our results show a clear increase in prediction error with increasing pool size. For pooled training sets, powered partial least squares discriminant analysis outperforms discriminant analysis, random forests, and support vector machines with linear or radial kernel in two of the three simulated scenarios. The proposed simulation approach can be applied to systematically investigate a number of additional scenarios of practical interest.
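The data-generation step described in the abstract can be illustrated with a minimal sketch. It is not the authors' framework: the function names (`simulate_group`, `pool_samples`), the unit-variance Gaussian noise, the parameter values, and the modelling of a pool as the arithmetic mean of its members are all assumptions made for illustration. Only the scenario with independent differentially expressed genes is covered, and no classifier is fitted.

```python
import random


def simulate_group(n_samples, n_genes, n_de, shift, rng):
    """Simulate expression profiles for one group (hypothetical helper).

    Each gene is drawn independently from N(mean, 1). The first `n_de`
    genes get mean `shift` (differentially expressed); the rest get mean 0.
    """
    return [
        [rng.gauss(shift if g < n_de else 0.0, 1.0) for g in range(n_genes)]
        for _ in range(n_samples)
    ]


def pool_samples(samples, pool_size):
    """Average consecutive blocks of `pool_size` samples, gene by gene.

    Assumption: a pool behaves like the arithmetic mean of its members.
    Samples left over after the last complete block are dropped.
    """
    pools = []
    for start in range(0, len(samples) - pool_size + 1, pool_size):
        block = samples[start:start + pool_size]
        pools.append([sum(column) / pool_size for column in zip(*block)])
    return pools


# Usage: two groups of 12 individuals, 100 genes, 10 of them shifted in group B.
rng = random.Random(42)
group_a = simulate_group(12, 100, 10, 0.0, rng)
group_b = simulate_group(12, 100, 10, 1.0, rng)

for pool_size in (1, 2, 4):
    train_a = pool_samples(group_a, pool_size)
    train_b = pool_samples(group_b, pool_size)
    print(pool_size, len(train_a), len(train_b))
```

With a fixed number of individuals, a larger pool size leaves fewer training profiles per group, which is one of the design parameters such a simulation can vary before comparing classification methods.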

Full text (PMC): http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2909718
DOI: http://dx.doi.org/10.1155/2010/318573
