A greedy stacking algorithm for model ensembling and domain weighting.

Christoph F Kurz, Werner Maier, Christian Rink

BMC Res Notes

MAN Truck & Bus AG Munich, Elisabeth-Selbert-Strasse 1, 80939, München, Germany.

Published: February 2020

Objective: Because it is impossible to know in advance which statistical learning algorithm will perform best on a given prediction task, it is common to use stacking methods to combine individual learners into a single, more powerful learner. Stacking algorithms are usually based on linear models, which can run into problems, especially when the base learners' predictions are highly correlated. In this study, we develop a greedy algorithm for model stacking that overcomes this issue while remaining fast and easy to interpret. We evaluate the greedy algorithm on seven data sets from various biomedical disciplines and compare it with linear stacking, genetic algorithm stacking, and a brute-force approach in different prediction settings. We further apply the algorithm to optimize the weighting of the individual domains (e.g., income, education) that make up the German Index of Multiple Deprivation (GIMD), so that the index is highly correlated with mortality.

Results: The greedy stacking algorithm provides good ensemble weights and outperforms the linear stacker in many tasks. The brute-force approach is still slightly superior but is computationally expensive. The greedy weighting algorithm is fast and efficient and has a variety of possible applications. A Python implementation is provided.
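
The abstract does not spell out the greedy procedure, so the following is only a minimal sketch of one common greedy weighting scheme (forward selection with replacement, scored by mean squared error), not the authors' implementation. The function name `greedy_stack`, the parameter `n_iter`, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np


def greedy_stack(preds, y, n_iter=100):
    """Greedy forward selection with replacement (illustrative sketch).

    preds : (n_samples, n_models) held-out predictions of the base learners
    y     : (n_samples,) true targets
    Returns non-negative ensemble weights that sum to 1.
    """
    preds = np.asarray(preds, dtype=float)
    y = np.asarray(y, dtype=float)
    counts = np.zeros(preds.shape[1], dtype=int)
    running_sum = np.zeros_like(y)

    for step in range(1, n_iter + 1):
        # Score the averaged ensemble that results from adding each model once more.
        candidates = (running_sum[:, None] + preds) / step
        mse = np.mean((candidates - y[:, None]) ** 2, axis=0)
        best = int(np.argmin(mse))
        counts[best] += 1
        running_sum += preds[:, best]

    # A model's weight is the share of rounds in which it was selected.
    return counts / counts.sum()


# Synthetic example (assumed data): two highly correlated learners plus one
# learner with independent errors.
rng = np.random.default_rng(0)
y = rng.normal(size=200)
base = y + rng.normal(scale=0.5, size=200)
preds = np.column_stack([
    base,
    base + rng.normal(scale=0.05, size=200),  # nearly a copy of the first learner
    y + rng.normal(scale=0.5, size=200),      # independent errors
])

weights = greedy_stack(preds, y)
ensemble = preds @ weights
```

Because the weights are selection frequencies, they are non-negative and sum to one by construction, which keeps them interpretable even when some predictions are nearly collinear. For a domain-weighting task such as the GIMD example, the squared-error score in this sketch could in principle be swapped for a correlation-based score; that substitution is likewise an assumption rather than a description of the published method.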

PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7017540
DOI: http://dx.doi.org/10.1186/s13104-020-4931-7

