Abstract
Feature selection has long been a focal point of research in various fields. Recent studies have applied random multi-subspace methods to extract more information from raw samples. However, this approach inadequately addresses the adverse effects of feature collinearity in high-dimensional datasets. To improve on the limited ability of traditional algorithms to extract useful information from raw samples, while accounting for feature collinearity during random subspace learning, we group features with a clustering approach based on correlation measures. We then construct subspaces with lower inter-feature correlations. When integrating the feature weights obtained from all feature spaces, we introduce a weighting factor to better handle the contributions of the different feature spaces. We comprehensively evaluate the proposed algorithm on ten real datasets and four synthetic datasets, comparing it with six other feature selection algorithms. Experimental results demonstrate that our algorithm, denoted KNCFS, effectively identifies relevant features and exhibits robust feature selection performance, making it well suited to practical feature selection problems.
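The pipeline outlined in the abstract (group correlated features, build low-correlation subspaces, combine per-subspace feature weights with a weighting factor) can be illustrated with a minimal sketch. This is not the authors' KNCFS implementation. The abstract does not specify the clustering method, the correlation measure, or how the weighting factor is computed, so the sketch assumes a greedy threshold grouping on absolute Pearson correlation and takes the per-subspace weights and factors as caller-supplied inputs. All function names and the `threshold` parameter are hypothetical.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / math.sqrt(vx * vy)


def group_features(columns, threshold=0.8):
    """Assumed stand-in for the clustering step: a feature joins the first group
    whose representative (first member) it correlates with at |r| >= threshold;
    otherwise it starts a new group."""
    groups = []
    for j, col in enumerate(columns):
        for g in groups:
            if abs(pearson(columns[g[0]], col)) >= threshold:
                g.append(j)
                break
        else:
            groups.append([j])
    return groups


def build_subspaces(groups, n_subspaces, rng):
    """Each subspace draws one random feature from every group, so features
    from the same correlated group never share a subspace."""
    return [[rng.choice(g) for g in groups] for _ in range(n_subspaces)]


def combine_weights(subspaces, weights, factors, n_features):
    """Factor-weighted average of per-subspace feature weights.
    weights[s][k] is the weight of feature subspaces[s][k];
    factors[s] is the (assumed, caller-supplied) weighting factor of subspace s."""
    total = [0.0] * n_features
    norm = [0.0] * n_features
    for feats, w, f in zip(subspaces, weights, factors):
        for j, wj in zip(feats, w):
            total[j] += f * wj
            norm[j] += f
    return [t / n if n > 0 else 0.0 for t, n in zip(total, norm)]


# Toy data, one list per feature: f1 is a linear function of f0, f3 is the negation of f2.
f0 = [1, 2, 3, 4, 5, 6, 7, 8]
f1 = [2 * v + 1 for v in f0]
f2 = [1, -1, 1, -1, 1, -1, 1, -1]
f3 = [-v for v in f2]
columns = [f0, f1, f2, f3]

groups = group_features(columns)  # [[0, 1], [2, 3]]
subspaces = build_subspaces(groups, n_subspaces=5, rng=random.Random(0))
```

In a full pipeline, each subspace would be passed to a base feature-weighting method (for example a Relief-style scorer), and the resulting weights and per-subspace factors handed to `combine_weights`. Features that never appear in any subspace receive a combined weight of 0.0 in this sketch.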
Publisher
Public Library of Science (PLoS)