Feature subset selection using heuristic and metaheuristic approaches for diabetes prediction on a binary encoded dataset

Author:

Kumar Puneet (1), Bhati Bhoopesh Singh (2), Dhanaraj Rajesh Kumar (3), Iwendi Celestine (4), Balusamy Balamurugan (5), Bhati Nitesh Singh (6), Rai Prerana (7)

Affiliation:

1. Department of Computer Science & Engineering, Chandigarh University, Punjab, India

2. Department of Computer Science & Engineering, Indian Institute of Information Technology, Sonepat, India

3. Symbiosis Institute of Computer Studies and Research (SICSR), Symbiosis International (Deemed University), Pune, India

4. School of Creative Technologies, University of Bolton, UK

5. Shiv Nadar University, Delhi-NCR Campus, Noida, India

6. Department of Computer Science & Engineering, Galgotias University, Greater Noida, India

7. Chandigarh University, Punjab, India

Abstract

Machine Learning (ML) models are prone to the curse of dimensionality. A dataset with more features incurs a higher computational cost and may lower prediction accuracy. Therefore, in this research work we predict diabetes more accurately while using fewer features. Two heuristic methods, Sequential Forward Selection (SFS) and Sequential Backward Selection (SBS), and two metaheuristic evolutionary methods, the Whale Optimization Algorithm (WOA) and the Genetic Algorithm (GA), are used for feature subset selection. The Gini index is also used as a filter evaluator. The performance of the feature subsets is analyzed by applying three different types of ML models: Random Forest (RF), Multi-Layer Perceptron (MLP) and K-Nearest Neighbor (KNN). We predict type-2 diabetes with an accuracy of 96.82%. We also reduce the number of features by up to 67.44%, i.e., we identify the 32.56% of features that are most relevant.
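To illustrate how a Gini-based filter evaluator might score binary encoded features, here is a minimal sketch. It is not the authors' implementation: the function names, the toy data and the choice to rank by weighted Gini impurity after splitting on each feature are all assumptions made for illustration. A lower weighted impurity means the feature separates the classes better.

```python
from collections import Counter


def gini_impurity(labels):
    """Gini impurity 1 - sum(p_k^2) of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def gini_after_split(feature, labels):
    """Weighted Gini impurity after partitioning samples by the feature's values."""
    n = len(labels)
    groups = {}
    for value, label in zip(feature, labels):
        groups.setdefault(value, []).append(label)
    return sum(len(g) / n * gini_impurity(g) for g in groups.values())


def rank_features(columns, labels):
    """Return (names sorted from lowest to highest weighted impurity, scores)."""
    scores = {name: gini_after_split(col, labels) for name, col in columns.items()}
    return sorted(scores, key=scores.get), scores


# Toy binary encoded data, invented for illustration only (hypothetical features).
labels = [1, 1, 1, 1, 0, 0, 0, 0]
columns = {
    "f_useless": [1, 1, 0, 0, 1, 1, 0, 0],  # carries no class information
    "f_noisy": [1, 1, 1, 0, 1, 0, 0, 0],    # agrees with the label on 6 of 8 samples
    "f_perfect": [1, 1, 1, 1, 0, 0, 0, 0],  # identical to the label
}

ranking, scores = rank_features(columns, labels)
print(ranking)
print(scores)
```

A filter such as this scores each feature independently of any classifier, whereas wrapper methods such as SFS and SBS score candidate subsets by training a model on them.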

Publisher

World Scientific Pub Co Pte Ltd
