Abstract
Within the design of a machine learning-based solution for classification or regression problems, variable selection techniques are often applied to identify the input variables that most affect the considered target. Selecting such variables offers valuable advantages, such as a lower complexity of the model and of the learning algorithm, reduced computational time, and improved performance. Moreover, variable selection helps to gain a deeper understanding of the considered problem. High correlation among variables often produces multiple, equally optimal subsets of variables, which makes traditional variable selection methods unstable and reduces confidence in the selected variables. Stability quantifies the reproducibility of a variable selection method; therefore, high stability is as important as high accuracy of the developed model. The paper presents an automatic procedure for variable selection in classification (binary and multi-class) and regression tasks, which provides an optimal stability index without requiring any a priori information on the data. The proposed approach has been tested on several small datasets, which are unstable by nature, and has achieved satisfactory results.
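The stability the abstract refers to can be measured by comparing the feature subsets selected across repeated runs on resampled data. As a minimal illustration (not the paper's actual stability index, whose definition is not given here), the sketch below computes the average pairwise Jaccard similarity of the selected subsets; identical selections on every run yield 1.0:

```python
from itertools import combinations

def jaccard_stability(subsets):
    """Average pairwise Jaccard similarity of selected-feature subsets.

    `subsets` is a list of sets, each holding the variables selected on
    one resampled version of the data; 1.0 means identical selections.
    """
    pairs = list(combinations(subsets, 2))
    if not pairs:
        return 1.0
    sims = [len(a & b) / len(a | b) if (a | b) else 1.0 for a, b in pairs]
    return sum(sims) / len(sims)

# Selections from three hypothetical resampling runs:
runs = [{"x1", "x2", "x3"}, {"x1", "x2", "x4"}, {"x1", "x2", "x3"}]
print(round(jaccard_stability(runs), 3))  # → 0.667
```

On small or highly correlated datasets, runs tend to swap correlated variables in and out (as `x3`/`x4` above), which is exactly the instability the abstract describes.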
Funder
Scuola Superiore Sant’Anna
Publisher
Springer Science and Business Media LLC
Subject
Artificial Intelligence, Computer Networks and Communications, General Neuroscience, Software
Cited by
1 article.
1. Feature Selection on Imbalanced Domains: A Stability-Based Analysis;Advances and Trends in Artificial Intelligence. Theory and Applications;2023