A Robust Supervised Variable Selection for Noisy High-Dimensional Data.

Biomed Res Int

Institute of Computer Science of the Czech Academy of Sciences, Pod Vodárenskou Vĕží 2, 182 07 Prague 8, Czech Republic ; Department of Biomedical Informatics, Faculty of Biomedical Engineering, Czech Technical University in Prague, Náměstí Sítná 3105, 272 01 Kladno, Czech Republic.

Published: March 2016

The Minimum Redundancy Maximum Relevance (MRMR) approach to supervised variable selection represents a successful methodology for dimensionality reduction, which is suitable for high-dimensional data observed in two or more different groups. Various available versions of the MRMR approach have been designed to search for variables with the largest relevance for a classification task while controlling for redundancy of the selected set of variables. However, usual relevance and redundancy criteria have the disadvantages of being too sensitive to the presence of outlying measurements and/or being inefficient. We propose a novel approach called Minimum Regularized Redundancy Maximum Robust Relevance (MRRMRR), suitable for noisy high-dimensional data observed in two groups. It combines principles of regularization and robust statistics. Particularly, redundancy is measured by a new regularized version of the coefficient of multiple correlation and relevance is measured by a highly robust correlation coefficient based on the least weighted squares regression with data-adaptive weights. We compare various dimensionality reduction methods on three real data sets. To investigate the influence of noise or outliers on the data, we perform the computations also for data artificially contaminated by severe noise of various forms. The experimental results confirm the robustness of the method with respect to outliers.

Download full-text PDF

Source
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4468284PMC
http://dx.doi.org/10.1155/2015/320385DOI Listing

Publication Analysis

Top Keywords

high-dimensional data
12
supervised variable
8
variable selection
8
noisy high-dimensional
8
redundancy maximum
8
mrmr approach
8
dimensionality reduction
8
data observed
8
observed groups
8
data
6

Similar Publications

Want AI Summaries of new PubMed Abstracts delivered to your In-box?

Enter search terms and have AI summaries delivered each week - change queries or unsubscribe any time!