Regression modelling is a powerful statistical tool often used in biomedical and clinical research. It can be formulated as an inverse problem that measures the discrepancy between the target outcome and the data produced by a representation of the modelled predictors. This approach can simultaneously perform variable selection and coefficient estimation. We focus particularly on the linear regression problem $y = X\beta$, where $\beta$ is the parameter of interest and its components are the regression coefficients. The inverse problem seeks an estimate of the parameter $\beta$, which is mapped by the linear operator $X$ to the observed outcome data $y$. This problem can be expressed as finding a solution in the affine subspace $\{\beta \in \mathbb{R}^p : X\beta = y\}$. However, in the presence of collinearity, high-dimensional data, and a high condition number of the related covariance matrix, the solution may not be unique, so prior information must be introduced to narrow the solution set and regularize the inverse problem. Informed by Huber's robust statistics framework, we propose an optimal regularizer for the regression problem. We compare the results of the proposed method with those of other penalized regression methods (ridge, lasso, adaptive lasso, and elastic net) under strong conditions, such as a high condition number of the covariance matrix and high error amplitude, on both simulated data and real data from the South London Stroke Register. The proposed approach can be extended to mixed regression models. Our inverse problem framework, coupled with robust statistics methodology, offers new insights into statistical regression and learning, and it may open new directions for model fitting and learning.
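For orientation, the penalized methods named above all fit the following standard template (a generic formulation, not the paper's exact notation; the form of the proposed Huber-informed regularizer is not reproduced here):

$$
\hat{\beta} = \arg\min_{\beta \in \mathbb{R}^p} \; \|y - X\beta\|_2^2 + \lambda\, P(\beta),
$$

where $P(\beta) = \|\beta\|_2^2$ gives ridge, $P(\beta) = \|\beta\|_1$ gives the lasso, $P(\beta) = \sum_j w_j |\beta_j|$ with data-driven weights $w_j$ gives the adaptive lasso, and $P(\beta) = \alpha\|\beta\|_1 + (1-\alpha)\|\beta\|_2^2$ gives the elastic net.

The minimal sketch below (illustrative only, not the authors' code) uses scikit-learn on a simulated design with nearly duplicated columns to show how such penalized fits behave under the conditions the abstract mentions, namely strong collinearity (a high condition number of $X^\top X$) and high error amplitude:

```python
# Illustrative sketch, not the authors' implementation: standard penalized
# regressions from the abstract on a simulated ill-conditioned design.
# The paper's proposed Huber-informed regularizer is not reproduced here.
import numpy as np
from sklearn.linear_model import Ridge, Lasso, ElasticNet

rng = np.random.default_rng(0)
n, p = 100, 20

# Nearly duplicated columns make X^T X severely ill-conditioned.
base = rng.normal(size=(n, 5))
X = np.hstack([base + 0.01 * rng.normal(size=(n, 5)) for _ in range(4)])

beta_true = np.zeros(p)
beta_true[:3] = [2.0, -1.5, 1.0]              # sparse true coefficients
y = X @ beta_true + 2.0 * rng.normal(size=n)  # high error amplitude

print("condition number of X^T X:", np.linalg.cond(X.T @ X))

# Compare the standard penalties named in the abstract.
for name, model in [("ridge", Ridge(alpha=1.0)),
                    ("lasso", Lasso(alpha=0.1, max_iter=10_000)),
                    ("elastic-net", ElasticNet(alpha=0.1, l1_ratio=0.5,
                                               max_iter=10_000))]:
    fit = model.fit(X, y)
    print(f"{name:12s} nonzero coefficients: {int(np.sum(fit.coef_ != 0))}")
```

The sparsity-inducing penalties (lasso, elastic net) zero out redundant coefficients, while ridge keeps all of them but shrinks their magnitudes, which is the variable-selection-versus-shrinkage trade-off the comparison in the paper examines.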
DOI: http://dx.doi.org/10.1002/bimj.201900283