Abstract
Multi-view stacking is a framework for combining information from different views (i.e., different feature sets) describing the same set of objects. In this framework, a base-learner algorithm is trained on each view separately, and the resulting predictions are then combined by a meta-learner algorithm. A previous study showed that stacked penalized logistic regression, a special case of multi-view stacking, is useful for identifying which views are most important for prediction. In this article we expand on that research by considering seven different algorithms for use as the meta-learner and evaluating their view selection and classification performance in simulations and in two applications to real gene-expression data sets. Our results suggest that if both view selection and classification accuracy are important to the research at hand, then the nonnegative lasso, nonnegative adaptive lasso and nonnegative elastic net are suitable meta-learners. Which of these three is to be preferred depends on the research context. The remaining four meta-learners, namely nonnegative ridge regression, nonnegative forward selection, stability selection and the interpolating predictor, show few advantages that would make them preferable to the other three.
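The two-stage procedure described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`fit_logistic`, `multi_view_stack`, `stacked_predict`), the plain gradient-descent logistic base learner, the 5-fold scheme, and the step-size and penalty values are all assumptions chosen for illustration. The meta-learner shown is a simple nonnegative lasso fitted by projected gradient descent, standing in for the penalized meta-learners the abstract names.

```python
import math


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic(X, y, nonneg=False, l1=0.0, lr=0.5, steps=300):
    """Logistic regression by gradient descent (illustrative, hypothetical helper).

    With nonneg=True the slopes are projected onto [0, inf) after each step.
    Under that constraint |w| = w, so the linear penalty l1 * sum(w) is the
    lasso penalty; l1 > 0 is therefore only allowed together with nonneg=True.
    """
    if l1 > 0 and not nonneg:
        raise ValueError("l1 penalty is only supported with nonneg=True")
    n, p = len(X), len(X[0])
    w, b = [0.0] * p, 0.0
    for _ in range(steps):
        gw, gb = [0.0] * p, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) - yi
            gb += err
            for j in range(p):
                gw[j] += err * xi[j]
        b -= lr * gb / n
        w = [wj - lr * (g / n + l1) for wj, g in zip(w, gw)]
        if nonneg:
            w = [max(0.0, wj) for wj in w]
    return w, b


def predict_proba(model, X):
    w, b = model
    return [sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) for xi in X]


def multi_view_stack(views, y, k=5, l1=0.01):
    """Train one base learner per view, then a nonnegative meta-learner.

    views: list of feature matrices (lists of rows), one per view, all
    describing the same n objects. Returns (base_models, meta_model).
    """
    n = len(y)
    folds = [list(range(i, n, k)) for i in range(k)]
    # Z[i][v] holds the cross-validated prediction for object i from view v.
    Z = [[0.0] * len(views) for _ in range(n)]
    for v, X in enumerate(views):
        for test in folds:
            held = set(test)
            train = [i for i in range(n) if i not in held]
            model = fit_logistic([X[i] for i in train], [y[i] for i in train])
            preds = predict_proba(model, [X[i] for i in test])
            for i, p in zip(test, preds):
                Z[i][v] = p
    # Base learners refitted on all data are used for future predictions.
    base_models = [fit_logistic(X, y) for X in views]
    # A view whose meta-level weight is exactly zero counts as not selected.
    meta_model = fit_logistic(Z, y, nonneg=True, l1=l1)
    return base_models, meta_model


def stacked_predict(base_models, meta_model, views):
    per_view = [predict_proba(m, X) for m, X in zip(base_models, views)]
    Z = [list(row) for row in zip(*per_view)]
    return predict_proba(meta_model, Z)
```

In this sketch the meta-learner is trained on cross-validated base-learner predictions rather than on in-sample fits, and the nonnegativity constraint means each view can only contribute positively to the final prediction. The adaptive lasso, elastic net, and the other meta-learners compared in the article would replace only the final `fit_logistic` call on `Z`.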
Funder
Nederlandse Organisatie voor Wetenschappelijk Onderzoek
Universiteit Leiden
Publisher
Springer Science and Business Media LLC