Abstract
The stability of a feature selection algorithm refers to its robustness to perturbations of the training set, parameter settings, or initialization. A stable feature selection algorithm is crucial for identifying a relevant subset of meaningful and interpretable features, which is extremely important in knowledge discovery. Although many measures for evaluating the stability of feature selection have been reported in the literature, none of them satisfies all the requisite properties of a stability measure. Among them, the Kuncheva index and its modifications are widely used in practical problems. In this work, the merits and limitations of the Kuncheva index and its existing modifications (Lustgarten, Wald, nPOG/nPOGR, Nogueira) are studied and analysed with respect to the requisite properties of a stability measure. An additional limitation of the most recent modification, Nogueira’s measure, is pointed out. Finally, corrections to Lustgarten’s measure are proposed to define a new modified stability measure that satisfies the desired properties and overcomes the limitations of existing popular similarity-based stability measures. The effectiveness of the newly modified Lustgarten’s measure is evaluated with simple toy experiments.
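For orientation, the baseline that these modifications build on can be sketched in a few lines. The code below is a minimal illustration of the standard Kuncheva consistency index for two equal-size feature subsets, (r·n − k²) / (k·(n − k)), where r is the number of shared features, k the subset size, and n the total number of features. The function names `kuncheva_index` and `kuncheva_stability` are hypothetical, and averaging over all pairs of subsets is one common way to obtain an overall stability score; it is not the modified measure proposed in this work.

```python
from itertools import combinations


def kuncheva_index(a, b, n):
    """Kuncheva consistency index between two feature subsets of equal size.

    a, b: iterables of selected feature identifiers
    n:    total number of features in the dataset
    """
    a, b = set(a), set(b)
    k = len(a)
    if len(b) != k:
        # The index is only defined for subsets of the same cardinality.
        raise ValueError("both subsets must have the same size")
    if k == 0 or k == n:
        # The denominator k * (n - k) would be zero.
        raise ValueError("subset size must satisfy 0 < k < n")
    r = len(a & b)  # number of features common to both subsets
    return (r * n - k * k) / (k * (n - k))


def kuncheva_stability(subsets, n):
    """Average pairwise Kuncheva index over a collection of subsets."""
    pairs = list(combinations(subsets, 2))
    if not pairs:
        raise ValueError("at least two subsets are required")
    return sum(kuncheva_index(a, b, n) for a, b in pairs) / len(pairs)


if __name__ == "__main__":
    # Three hypothetical selection runs over a dataset with 10 features.
    runs = [{0, 1, 2}, {0, 1, 2}, {0, 1, 3}]
    print(kuncheva_index(runs[0], runs[1], n=10))
    print(kuncheva_stability(runs, n=10))
```

The two guard clauses mark where the index is undefined: subsets of unequal size, and subset sizes of 0 or n. The abstract does not say which limitations each modification addresses, so the sketch makes no claim about that.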
Funder
Japan Society for the Promotion of Science
Subject
General Economics, Econometrics and Finance
Cited by
2 articles.
1. Stability of filter feature selection methods in data pipelines: a simulation study. International Journal of Data Science and Analytics, 2022-12-14.
2. Performance Analysis of Extended Lustgarten Index for Stability of Feature Selection. 2021 IEEE International Conference on Service Operations and Logistics, and Informatics (SOLI), 2021-12-11.