Loss-guided stability selection

Published: 2023-12-15
ISSN: 1862-5347
Container-title: Advances in Data Analysis and Classification
Language: en
Short-container-title: Adv Data Anal Classif

Author:

Werner Tino

Abstract

In modern data analysis, sparse model selection becomes inevitable once the number of predictor variables is very high. It is well known that model selection procedures like the Lasso or Boosting tend to overfit on real data. The celebrated Stability Selection overcomes these weaknesses by aggregating models fitted on subsamples of the training data and then choosing a stable predictor set, which is usually much sparser than the predictor sets from the raw models. The standard Stability Selection is based on a global criterion, namely the per-family error rate, and additionally requires expert knowledge to configure the hyperparameters suitably. Model selection depends on the loss function, i.e., predictor sets selected w.r.t. one loss function differ from those selected w.r.t. another. Therefore, we propose a Stability Selection variant which respects the chosen loss function via an additional validation step based on out-of-sample validation data, optionally enhanced with an exhaustive search strategy. Our Stability Selection variants are widely applicable and user-friendly. Moreover, they can avoid the severe underfitting that affects the original Stability Selection on noisy high-dimensional data: our priority is not to avoid false positives at all costs but to obtain a sparse, stable model with which one can make predictions. Experiments on both regression and binary classification, with Boosting as the model selection algorithm, reveal a significant precision improvement compared to raw Boosting models while not suffering from any of the mentioned issues of the original Stability Selection.
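The general idea of the abstract (aggregate selections over subsamples, then let an out-of-sample loss pick the stable set) can be illustrated with a minimal sketch. This is not the paper's implementation: the function names, the threshold grid, the use of componentwise L2-Boosting as base learner, the squared-error loss, and the synthetic data are all assumptions made for illustration only.

```python
import random

def l2_boost(cols, y, n_iter=30, step=0.1):
    """Componentwise L2-Boosting on column-major data (illustrative base learner)."""
    intercept = sum(y) / len(y)
    resid = [yi - intercept for yi in y]
    coef = [0.0] * len(cols)
    norms = [sum(v * v for v in c) for c in cols]
    for _ in range(n_iter):
        best_j, best_gain, best_b = None, 0.0, 0.0
        for j, c in enumerate(cols):
            if norms[j] == 0.0:
                continue
            dot = sum(v * r for v, r in zip(c, resid))
            gain = dot * dot / norms[j]  # residual sum of squares removed by column j
            if gain > best_gain:
                best_j, best_gain, best_b = j, gain, dot / norms[j]
        if best_j is None:
            break
        coef[best_j] += step * best_b
        resid = [r - step * best_b * v for r, v in zip(resid, cols[best_j])]
    return intercept, coef

def selection_frequencies(cols, y, n_subsamples=30, seed=0):
    """Fraction of half-size subsamples in which each predictor is selected."""
    rng = random.Random(seed)
    n = len(y)
    counts = [0] * len(cols)
    for _ in range(n_subsamples):
        idx = rng.sample(range(n), n // 2)
        sub_cols = [[c[i] for i in idx] for c in cols]
        _, coef = l2_boost(sub_cols, [y[i] for i in idx])
        for j, b in enumerate(coef):
            if b != 0.0:
                counts[j] += 1
    return [c / n_subsamples for c in counts]

def mse(y, pred):
    return sum((a - b) ** 2 for a, b in zip(y, pred)) / len(y)

def loss_guided_stable_set(train_cols, train_y, val_cols, val_y,
                           thresholds=(0.5, 0.6, 0.7, 0.8, 0.9, 1.0), loss=mse):
    """Pick the frequency threshold whose stable set minimizes the validation loss."""
    freqs = selection_frequencies(train_cols, train_y)
    best = None
    for t in thresholds:
        stable = [j for j, f in enumerate(freqs) if f >= t]
        if not stable:
            continue
        # Refit on the training data using only the candidate stable predictors.
        intercept, coef = l2_boost([train_cols[j] for j in stable], train_y, n_iter=100)
        preds = [intercept + sum(b * val_cols[j][i] for b, j in zip(coef, stable))
                 for i in range(len(val_y))]
        score = loss(val_y, preds)
        if best is None or score < best[0]:
            best = (score, t, stable)
    return best  # None if every threshold yields an empty set

def make_data(n, p, rng):
    """Synthetic regression data: only columns 0, 1, 2 carry signal (assumption)."""
    cols = [[rng.gauss(0, 1) for _ in range(n)] for _ in range(p)]
    y = [3 * cols[0][i] - 2 * cols[1][i] + 2 * cols[2][i] + rng.gauss(0, 0.5)
         for i in range(n)]
    return cols, y

rng = random.Random(42)
train_cols, train_y = make_data(120, 15, rng)
val_cols, val_y = make_data(60, 15, rng)
score, threshold, stable = loss_guided_stable_set(train_cols, train_y, val_cols, val_y)
print("threshold:", threshold, "stable set:", stable, "validation loss:", round(score, 3))
```

Passing a different `loss` function (for example a classification loss together with a suitable base learner) changes which stable set is chosen, which is the dependence on the loss function that the abstract emphasizes.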

Funder

Carl von Ossietzky Universität Oldenburg

Publisher

Springer Science and Business Media LLC

Subject

Applied Mathematics, Computer Science Applications, Statistics and Probability

Link

https://link.springer.com/content/pdf/10.1007/s11634-023-00573-3.pdf


Cited by 1 article.

1. Trimming stability selection increases variable selection robustness. Machine Learning, 2023-10-04