Summary: Our aim is to improve omics-based prediction and feature selection using multiple sources of auxiliary information: co-data. Adaptive group-regularized ridge regression (GRridge) was proposed to achieve this by estimating additional group-based penalty parameters through an empirical Bayes method at a low computational cost. We illustrate the GRridge method and software on RNA sequencing datasets. The method boosts the performance of ordinary ridge regression and outperforms other classifiers. Post-hoc feature selection maintains the predictive ability of the classifier with far fewer markers.
Availability and Implementation: GRridge is an R package that includes a vignette. It is freely available at https://bioconductor.org/packages/GRridge/. All information and R scripts used in this study, including those on retrieval and processing of the co-data, are available from http://github.com/markvdwiel/GRridgeCodata.
Contact: mark.vdwiel@vumc.nl.
Supplementary Information: Supplementary data are available at Bioinformatics online.
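The group-based penalties described in the Summary can be illustrated with a minimal sketch. It assumes a linear outcome and takes the group multipliers as given; it does not implement the empirical Bayes estimation of those multipliers, and it is not the GRridge package's interface. All names (`solve_linear_system`, `group_ridge`, `base_penalty`, `group_multipliers`) are hypothetical.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the inputs are not modified.
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def group_ridge(x, y, groups, base_penalty, group_multipliers):
    """Ridge estimate where feature j is penalized by
    base_penalty * group_multipliers[groups[j]] (assumed given, not estimated)."""
    n, p = len(x), len(x[0])
    xtx = [[sum(x[i][j] * x[i][k] for i in range(n)) for k in range(p)]
           for j in range(p)]
    xty = [sum(x[i][j] * y[i] for i in range(n)) for j in range(p)]
    for j in range(p):
        xtx[j][j] += base_penalty * group_multipliers[groups[j]]
    return solve_linear_system(xtx, xty)


# Toy data: two features in two co-data groups, orthogonal design.
x_toy = [[1.0, 0.0], [0.0, 1.0]]
y_toy = [2.0, 2.0]
groups_toy = [0, 1]                    # feature 0 -> group 0, feature 1 -> group 1
multipliers_toy = {0: 1.0, 1: 3.0}     # group 1 is penalized three times as hard

beta_toy = group_ridge(x_toy, y_toy, groups_toy, 1.0, multipliers_toy)
# beta_toy is [1.0, 0.5]: the feature in the more heavily penalized group shrinks more.
```

With all multipliers set to 1.0 the sketch reduces to ordinary ridge regression; a multiplier below 1.0 lets a group judged informative by the co-data be shrunk less.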
DOI: http://dx.doi.org/10.1093/bioinformatics/btw837
BMC Bioinformatics
December 2017
Department of Epidemiology and Biostatistics, VU University Medical Center, Amsterdam, 1007 MB, The Netherlands.
Background: Prediction in high-dimensional settings is difficult because the number of variables is large relative to the sample size. We demonstrate how auxiliary 'co-data' can be used to improve the performance of a Random Forest in such a setting.
Results: Co-data are incorporated in the Random Forest by replacing the uniform sampling probabilities used to draw candidate variables with co-data-moderated sampling probabilities.
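The replacement of uniform sampling by weighted sampling can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the co-data have already been turned into one non-negative weight per variable (how that is done is not shown here) and that candidates are drawn without replacement. The names `draw_candidates`, `weights` and `mtry` are hypothetical.

```python
import random


def draw_candidates(weights, mtry, rng):
    """Draw mtry distinct variable indices with probability proportional
    to the supplied (assumed co-data derived) weights."""
    positive = sum(1 for w in weights if w > 0)
    if mtry > positive:
        raise ValueError("mtry exceeds the number of variables with positive weight")
    remaining = list(range(len(weights)))
    chosen = []
    for _ in range(mtry):
        # Re-weight over the variables not yet drawn, then pick one.
        current = [weights[j] for j in remaining]
        pick = rng.choices(remaining, weights=current, k=1)[0]
        chosen.append(pick)
        remaining.remove(pick)
    return chosen


rng = random.Random(0)
uniform = [1.0, 1.0, 1.0, 1.0]      # standard Random Forest: every variable equally likely
moderated = [0.0, 1.0, 0.0, 3.0]    # hypothetical co-data weights favouring variables 1 and 3

candidates_uniform = draw_candidates(uniform, 2, rng)
candidates_moderated = draw_candidates(moderated, 2, rng)
```

In a forest, such a draw would happen at each split (or per tree, depending on the design), so variables that the co-data favour are offered as split candidates more often.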
Bioinformatics
May 2017
Department of Epidemiology and Biostatistics, VU University Medical Center, Amsterdam, The Netherlands.