We consider variable selection when missing values are present in the predictor variables. We compare complete-case analysis with multiple imputation, using backward selection (backwards stepping) and least angle regression as the selection methods. The approaches are studied using a data set from a rheumatological disease (myositis). We find that the coefficients differ slightly and that the estimated standard errors are smaller in the complete cases, which is not a surprise. This appears to be because the estimated residual variance is smaller in the complete cases, which are more homogeneous than the full data.
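The abstract does not say how the analyses of the imputed data sets were combined. As a minimal sketch, assuming the standard pooling approach (Rubin's rules), the following shows how a coefficient and its standard error could be pooled across imputations. The function name `pool_rubin` and the numbers are hypothetical illustrations, not values from the myositis data.

```python
from math import sqrt
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool one coefficient across m imputed data sets (Rubin's rules).

    estimates: the m per-imputation coefficient estimates
    variances: the m per-imputation squared standard errors
    Returns (pooled estimate, pooled standard error).
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least two imputations of equal length")
    q_bar = mean(estimates)        # pooled point estimate
    within = mean(variances)       # average within-imputation variance
    between = variance(estimates)  # between-imputation variance (divisor m - 1)
    total = within + (1 + 1 / m) * between
    return q_bar, sqrt(total)


# Hypothetical inputs: three imputations of a single coefficient.
est = [0.50, 0.60, 0.40]
se2 = [0.04, 0.04, 0.04]
pooled, pooled_se = pool_rubin(est, se2)
```

Under this pooling rule, the between-imputation term adds to the total variance, so a pooled standard error is never smaller than the square root of the average within-imputation variance.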
DOI: http://dx.doi.org/10.1177/0962280209358003