Introduction: Variable selection is a common technique for identifying the most predictive variables from a pool of candidate predictors. Low-prevalence predictors (LPPs) are frequently found in clinical data, yet few studies have explored their impact on model performance during variable selection. This study compared the Random Forest (RF) algorithm and stepwise regression (SWR) for variable selection using data from a paediatric sepsis screening tool, in which 18 of 32 predictors had a prevalence below 10%.