Detecting High-Risk Factors and Early Diagnosis of Diabetes Using Machine Learning Methods

Author:

Ullah Zahid 1, Saleem Farrukh 1, Jamjoom Mona 2, Fakieh Bahjat 1, Kateb Faris 3, Ali Abdullah Marish 4, Shah Babar 5

Affiliation:

1. Department of Information Systems, Faculty of Computing and Information Technology, King Abdulaziz University, Jeddah, Saudi Arabia

2. Department of Computer Sciences, College of Computer and Information Sciences, Princess Nourah Bint Abdulrahman University, Riyadh, Saudi Arabia

3. Department of Information Technology, Faculty of Computing and Information Technology, King Abdulaziz University, Jeddah, Saudi Arabia

4. Department of Computer Science, Faculty of Computing and Information Technology, King Abdulaziz University, Jeddah, Saudi Arabia

5. College of Technological Innovation, Zayed University, Abu Dhabi, UAE

Abstract

Diabetes is a chronic disease that can cause several forms of long-term damage to the human body, including heart problems, kidney failure, depression, eye damage, and nerve damage. Several risk factors contribute to the disease, the most common being obesity, age, insulin resistance, and hypertension. Early detection of these risk factors is therefore vital in helping patients reverse diabetes at an early stage and live healthy lives. Machine learning (ML) is a useful tool that can detect diabetes from several risk factors and, based on the findings, provide a decision-based model that helps in diagnosing the disease. This study aims to detect the risk factors of diabetes using ML methods and to provide a decision support system that helps medical practitioners diagnose diabetes. In addition to various other preprocessing steps, the study used the synthetic minority over-sampling technique integrated with edited nearest neighbor (SMOTE-ENN) to balance the Behavioral Risk Factor Surveillance System (BRFSS) dataset. SMOTE-ENN is a more powerful method than SMOTE alone. Several ML methods were applied to the processed BRFSS dataset to build prediction models for detecting the risk factors that can help in diagnosing diabetes patients at an early stage. The prediction models were evaluated using various measures, which show the high performance of the models. The experimental results show the reliability of the proposed models: k-nearest neighbor (KNN) outperformed the other methods, with an accuracy of 98.38% and sensitivity, specificity, and ROC/AUC scores of 98%. Compared with existing state-of-the-art methods, the results confirm the efficacy of the proposed models in terms of accuracy and other evaluation measures. Using SMOTE-ENN to balance the dataset helps build more accurate prediction models, and this was the main reason it was possible to achieve models more accurate than the existing ones.
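
The balancing step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, which the abstract does not give: it assumes a binary problem, Euclidean distance, a majority-vote variant of edited nearest neighbor, and a small invented toy dataset rather than BRFSS. The names `k_nearest`, `smote`, `edited_nearest_neighbours`, `smote_enn`, `X_toy`, and `y_toy` are hypothetical.

```python
import math
import random
from collections import Counter


def k_nearest(point, data, k, skip_index=None):
    """Return indices of the k rows in `data` closest to `point` (Euclidean)."""
    dists = [
        (math.dist(point, row), i)
        for i, row in enumerate(data)
        if i != skip_index
    ]
    return [i for _, i in sorted(dists)[:k]]


def smote(X, y, minority, k=3, seed=0):
    """Oversample the minority class until both classes have equal size.

    Each synthetic row is a random interpolation between a minority row
    and one of its k nearest minority neighbours (binary labels assumed).
    """
    rng = random.Random(seed)
    minority_rows = [row for row, label in zip(X, y) if label == minority]
    n_majority = len(y) - len(minority_rows)
    n_new = max(n_majority - len(minority_rows), 0)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority_rows))
        base = minority_rows[i]
        j = rng.choice(k_nearest(base, minority_rows, k, skip_index=i))
        neighbour = minority_rows[j]
        gap = rng.random()
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return X + synthetic, y + [minority] * len(synthetic)


def edited_nearest_neighbours(X, y, k=3):
    """Drop rows whose label disagrees with the majority of their k neighbours.

    Assumption: majority-vote editing, applied to both classes.
    """
    keep = []
    for i, (row, label) in enumerate(zip(X, y)):
        votes = Counter(y[j] for j in k_nearest(row, X, k, skip_index=i))
        if votes.most_common(1)[0][0] == label:
            keep.append(i)
    return [X[i] for i in keep], [y[i] for i in keep]


def smote_enn(X, y, minority, k=3, seed=0):
    """Oversample with SMOTE, then clean the result with ENN."""
    X_over, y_over = smote(X, y, minority, k, seed)
    return edited_nearest_neighbours(X_over, y_over, k)


# Invented toy data: label 0 is the majority class, label 1 the minority.
# The row [5.0, 5.2] is a majority-labelled point inside the minority cluster.
X_toy = [
    [0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0], [0.5, 0.5], [5.0, 5.2],
    [5.0, 5.0], [5.0, 6.0], [6.0, 5.0],
]
y_toy = [0, 0, 0, 0, 0, 0, 1, 1, 1]

X_over, y_over = smote(X_toy, y_toy, minority=1)
X_res, y_res = smote_enn(X_toy, y_toy, minority=1)
print("after SMOTE:    ", Counter(y_over))
print("after SMOTE-ENN:", Counter(y_res))
```

On this toy data, SMOTE first raises the minority class from three to six samples, and the cleaning pass then removes the majority-labelled point that sits inside the minority cluster, leaving five majority and six minority samples. That second pass is what distinguishes SMOTE-ENN from SMOTE alone: oversampling balances the classes, while editing discards samples that conflict with their neighbourhood.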

Funder

Institutional Fund Project

Publisher

Hindawi Limited

Subject

General Mathematics, General Medicine, General Neuroscience, General Computer Science
