A greedy stacking algorithm for model ensembling and domain weighting

Published:2020-02-12 Issue:1 Volume:13 Page:
ISSN:1756-0500
Container-title:BMC Research Notes
language:en
Short-container-title:BMC Res Notes

Author:

Kurz Christoph F., Maier Werner, Rink Christian

Abstract

Objective: Because it is impossible to know in advance which statistical learning algorithm will perform best on a given prediction task, it is common to use stacking methods to combine individual learners into a single, more powerful learner. Stacking algorithms are usually based on linear models, which may run into problems, especially when the learners' predictions are highly correlated. In this study, we develop a greedy algorithm for model stacking that overcomes this issue while remaining very fast and easy to interpret. We evaluate the greedy algorithm on seven data sets from various biomedical disciplines and compare it to linear stacking, genetic algorithm stacking, and a brute force approach in different prediction settings. We further apply the algorithm to optimize the weighting of the individual domains (e.g., income, education) that make up the German Index of Multiple Deprivation (GIMD) so that the index is highly correlated with mortality.

Results: The greedy stacking algorithm provides good ensemble weights and outperforms the linear stacker in many tasks. The brute force approach is still slightly superior but is computationally expensive. The greedy weighting algorithm has a variety of possible applications and is fast and efficient. A Python implementation is provided.
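The abstract does not spell out the steps of the greedy algorithm, so the following is only a minimal sketch of one common greedy approach to ensemble weighting (forward selection with replacement) and may differ from the authors' published implementation. The function name `greedy_stack`, its parameters `preds`, `y`, and `n_iter`, and the choice of mean squared error as the loss are all assumptions made for illustration.

```python
import numpy as np


def greedy_stack(preds, y, n_iter=50):
    """Greedy forward selection with replacement (illustrative sketch).

    preds  : array of shape (n_samples, n_models) with held-out predictions
    y      : array of shape (n_samples,) with the true targets
    n_iter : number of greedy steps (hypothetical tuning parameter)

    Returns non-negative weights of shape (n_models,) that sum to 1.
    """
    preds = np.asarray(preds, dtype=float)
    y = np.asarray(y, dtype=float)
    n_models = preds.shape[1]

    counts = np.zeros(n_models)        # how often each model was selected
    running_sum = np.zeros_like(y)     # sum of the predictions selected so far

    for step in range(1, n_iter + 1):
        # Ensemble prediction if each model in turn were added once more.
        candidates = (running_sum[:, None] + preds) / step
        # Assumed loss: mean squared error of each candidate ensemble.
        errors = ((candidates - y[:, None]) ** 2).mean(axis=0)
        best = int(np.argmin(errors))
        counts[best] += 1
        running_sum += preds[:, best]

    # Selection frequencies become the ensemble weights.
    return counts / counts.sum()


# Toy data: two models with exactly opposite errors and one poor model.
y_demo = np.array([1.0, 2.0, 3.0, 4.0])
err = np.array([1.0, -1.0, 1.0, -1.0])
preds_demo = np.column_stack([y_demo + err, y_demo - err, y_demo + 10.0])
weights_demo = greedy_stack(preds_demo, y_demo, n_iter=10)
print(weights_demo)
```

In this toy example the two models with opposite errors cancel each other, so the sketch gives each of them half of the weight and ignores the third model. Because the weights are selection frequencies, they are non-negative and sum to one by construction, which keeps them easy to interpret even when predictions are highly correlated. In principle the same loop could score candidates by correlation with an outcome instead of squared error, as in the domain-weighting task described above.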

Publisher

Springer Science and Business Media LLC

Subject

General Biochemistry, Genetics and Molecular Biology,General Medicine

Link

http://link.springer.com/content/pdf/10.1186/s13104-020-4931-7.pdf

References: 24 articles.

1. Wolpert DH. Stacked generalization. Neural Netw. 1992;5(2):241–59.

2. Breiman L. Stacked regressions. Mach Learn. 1996;24(1):49–64.

3. Van der Laan MJ, Polley EC, Hubbard AE. Super learner. Stat Appl Genet Mol. 2007;6(1):7.

4. Rose S. Mortality risk score prediction in an elderly population using machine learning. Am J Epidemiol. 2013;177(5):443–52.

5. Sikora R, Hmoud Al-laymoun O. A modified stacking ensemble machine learning algorithm using genetic algorithms. J Int Tech Inform Manag. 2014;23(1):1.

Cited by 14 articles.

1. Identifying potential (re)hemorrhage among sporadic cerebral cavernous malformations using machine learning;Scientific Reports;2024-05-14

2. Greedy Weighted Stacking of Machine Learning Models for Optimizing Dam Deformation Prediction;Water;2024-04-25

3. Optimizing Ensemble Learning to Reduce Misclassification Costs in Credit Risk Scorecards;Mathematics;2024-03-14

4. An Experimental Comparison on Machine Learning Ensemble Stacking‐Based Air Quality Prediction System;Artificial Intelligence for Sustainable Applications;2023-09-05

5. Shear capacity prediction for reinforced concrete deep beams with web openings using artificial intelligence methods;Engineering Structures;2023-04