Leveraging Feature Sensitivity and Relevance: A Hybrid Feature Selection Approach for Improved Model Performance in Supervised Classification

Authors:

Saranya G¹, Rajendran Rakesh², Jaganathan Subash Chandra Bose³, Pandimurugan V¹

Affiliation:

1. SRM Institute of Science and Technology

2. Regenesys Institute of Management

3. VIT Bhopal University

Abstract

Many feature selection algorithms focus primarily on identifying relevant features and eliminating redundant ones. This hybrid work determines the significant features based on the estimated individual feature sensitivities and the degree of relevance between each feature and the target outcome. Most existing work uses mutual information (MI) to quantify the information shared between two variables. By scaling the range of the MI to [0,1], Symmetrical Uncertainty (SU) can be viewed as the normalized MI. In the proposed work, Symmetrical Uncertainty-Relevance (SU-R) is used to measure the relevance between each feature and the target outcome, and Per Feature Sensitivity Analysis (PFS) is used to measure each feature's individual sensitivity with respect to the target outcome. Features are ranked by the sum of the ranks they receive individually from SU-R and PFS. Less significant features are then eliminated iteratively, starting with the lowest-ranked feature under the combined SU-R and PFS ranking. To evaluate how well the proposed method identifies important features, we assess the influence of each feature on the model's performance using metrics such as F1 score and accuracy. This evaluation is conducted on two diverse public datasets from the UCI Machine Learning Repository, allowing us to assess the method's robustness across different data types. The hybrid method identified the best 450 significant features out of 754 in the Parkinson's disease dataset, and the top 150 features out of 562 in the smartphone dataset. An SVM classifier trained on the significant features selected by the proposed hybrid PFS and SU-R technique outperforms an SVM trained on features chosen by existing feature selection methods.
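The rank-sum step described in the abstract can be illustrated with a short sketch. It is not the authors' implementation. It assumes discrete feature values, uses the standard definition SU(X, Y) = 2 I(X; Y) / (H(X) + H(Y)), and takes the per-feature sensitivity scores as a given input, because the abstract does not say how PFS is computed. The function names, the toy data, and the tie-breaking rule are all hypothetical.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())


def symmetrical_uncertainty(x, y):
    """SU(X, Y) = 2 * I(X; Y) / (H(X) + H(Y)), which lies in [0, 1]."""
    hx, hy = entropy(x), entropy(y)
    if hx + hy == 0:
        return 0.0  # both variables are constant
    mi = hx + hy - entropy(list(zip(x, y)))  # I(X;Y) = H(X) + H(Y) - H(X,Y)
    return 2.0 * mi / (hx + hy)


def ranks(scores):
    """Rank 1 = highest score. Ties are broken by feature order (an assumption)."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    result = [0] * len(scores)
    for position, index in enumerate(order, start=1):
        result[index] = position
    return result


def elimination_order(su_scores, sensitivity_scores):
    """Feature indices from least to most significant by summed rank."""
    combined = [a + b for a, b in zip(ranks(su_scores), ranks(sensitivity_scores))]
    return sorted(range(len(combined)), key=lambda i: -combined[i])


# Toy data: three discrete features and a binary target (all hypothetical).
target = [0, 0, 1, 1]
features = [
    [0, 0, 1, 1],  # identical to the target
    [0, 1, 0, 1],  # independent of the target
    [0, 0, 0, 1],  # partially informative
]
su_scores = [symmetrical_uncertainty(f, target) for f in features]
sensitivity_scores = [0.9, 0.1, 0.5]  # placeholder values standing in for PFS
order = elimination_order(su_scores, sensitivity_scores)
print(order)
```

In a full backward-elimination loop, the classifier would be retrained and scored (for example on F1 and accuracy) after each feature in `order` is dropped. That loop is omitted here because its stopping rule is not given in the abstract.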

Publisher

Research Square Platform LLC

