
A Bootstrap Framework for Aggregating within and between Feature Selection Methods

Published: 2021-02-06
Journal: Entropy
Volume: 23, Issue: 2, Page: 200
ISSN: 1099-4300
Language: en

Author:

Salman, Reem; Alzaatreh, Ayman; Sulieman, Hana; Faisal, Shaimaa

Abstract

In the past decade, big data has become increasingly prevalent in a large number of applications. As a result, datasets suffering from noise and redundancy have necessitated the use of feature selection across multiple domains. However, a common concern in feature selection is that different approaches can give very different results when applied to similar datasets. Aggregating the results of different selection methods helps to resolve this concern and to control the diversity of the selected feature subsets. In this work, we implemented a general framework for the ensemble of multiple feature selection methods. Based on diversified datasets generated from the original set of observations, we aggregated the importance scores generated by multiple feature selection techniques using two methods: the Within Aggregation Method (WAM), which aggregates importance scores within a single feature selection method, and the Between Aggregation Method (BAM), which aggregates importance scores between multiple feature selection methods. We applied the proposed framework to 13 real datasets with diverse performances and characteristics. The experimental evaluation showed that WAM provides an effective tool for determining the best feature selection method for a given dataset. WAM also showed greater stability than BAM in identifying important features. The computational demands of the two methods appeared to be comparable. The results of this work suggest that by applying both WAM and BAM, practitioners can gain a deeper understanding of the feature selection process.
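
The WAM/BAM idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state the aggregation rule, so an arithmetic mean over bootstrap samples is assumed here, scores are min-max scaled per sample to make methods comparable (also an assumption), and the two scorers (`abs_correlation_scores`, `rank_correlation_scores`) and all function names are hypothetical stand-ins for real feature selection methods.

```python
import numpy as np

def abs_correlation_scores(X, y):
    """Hypothetical scorer: absolute Pearson correlation of each feature with y."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    denom = np.sqrt((Xc ** 2).sum(axis=0) * (yc ** 2).sum())
    denom = np.where(denom == 0, 1.0, denom)  # guard against constant columns
    return np.abs(Xc.T @ yc) / denom

def rank_correlation_scores(X, y):
    """Hypothetical scorer: absolute correlation of simple ranks (ties broken by position)."""
    def rank(a):
        return np.argsort(np.argsort(a, axis=0), axis=0).astype(float)
    return abs_correlation_scores(rank(X), rank(y))

def bootstrap_scores(X, y, scorers, n_boot=50, seed=0):
    """Return scaled scores with shape (n_methods, n_boot, n_features)."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    out = np.empty((len(scorers), n_boot, X.shape[1]))
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)  # resample rows with replacement
        for m, scorer in enumerate(scorers):
            s = scorer(X[idx], y[idx])
            span = s.max() - s.min()
            # Assumed min-max scaling so different methods share a [0, 1] range.
            out[m, b] = (s - s.min()) / span if span > 0 else np.zeros_like(s)
    return out

def wam(scores):
    """Within aggregation (assumed mean): one score vector per method."""
    return scores.mean(axis=1)

def bam(scores):
    """Between aggregation (assumed mean): one score vector across all methods."""
    return scores.mean(axis=(0, 1))
```

On synthetic data where only the first two features drive the response, `wam` returns one row of aggregated scores per method, which can be compared to judge the methods individually, and `bam` returns a single consensus vector. With the same number of bootstrap samples per method and mean aggregation, the BAM vector in this sketch is simply the average of the WAM rows; the paper's actual rule may differ.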

Funder

American University of Sharjah

Publisher

MDPI AG

Subject

General Physics and Astronomy

Link

https://www.mdpi.com/1099-4300/23/2/200/pdf


Cited by 16 articles.

2. Identification and visualisation of zombie firms using self-organizing maps;Annals of Operations Research;2024-08-15

3. Multimodal Machine Learning-Based Ductal Carcinoma in situ Prediction from Breast Fibromatosis;Cancer Management and Research;2024-07

4. Flood susceptibility mapping through geoinformatics and ensemble learning methods, with an emphasis on the AdaBoost-Decision Tree algorithm, in Mazandaran, Iran;Earth Science Informatics;2024-01-15

5. Feature selection of the respiratory microbiota associated with asthma;Journal of Big Data;2023-06-01