Affiliation:
1. Department of Electronics and Communications, University of Allahabad, Prayagraj, Allahabad 211002, Uttar Pradesh, India
Abstract
Diabetes is a chronic disease that affects millions of people worldwide, and accurate, timely diagnosis is crucial for its effective treatment and management. Machine learning has shown promise in predicting the disease, but missing data, outliers, class imbalance, and the limitations of individual classifiers can reduce accuracy. To address these challenges, we propose a machine learning approach that combines four components: adaptive iterative imputation (AII) to fill in missing values, a dynamic ensemble isolation forest (DE-IF) to detect and remove outliers, Iterated KMeans SMOTEENN (IKMSENN) to correct class imbalance, and an adaptive extra tree classifier (AETC) for classification. We evaluate the approach on the Pima Indian Diabetes Dataset (PIDD), a widely used benchmark for diabetes prediction. Experimental results show that it outperforms several state-of-the-art machine learning models in terms of accuracy, precision, recall, F-measure, and the area under the receiver operating characteristic curve (AUC-ROC). On PIDD, the approach achieved an accuracy of 98.58%, a precision of 0.986, a recall of 0.987, an F-measure of 0.985, and an AUC-ROC of 0.965. By addressing missing data, outliers, class imbalance, and classifier limitations within a single pipeline, the approach has the potential to improve the accuracy of diabetes prediction, with implications for the diagnosis and management of the disease.
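For readers unfamiliar with the reported metrics, the following is a minimal sketch of how accuracy, precision, recall, and F-measure are commonly derived from a binary confusion matrix. It assumes the F-measure is the balanced F1 score (the abstract does not say which variant or averaging scheme was used). The names `ConfusionCounts` and `classification_metrics` are hypothetical, and the counts are illustrative only, not results from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Binary confusion-matrix counts (positive class = diabetic)."""
    tp: int  # diabetic, predicted diabetic
    fp: int  # non-diabetic, predicted diabetic
    fn: int  # diabetic, predicted non-diabetic
    tn: int  # non-diabetic, predicted non-diabetic


def classification_metrics(c: ConfusionCounts) -> dict[str, float]:
    """Return accuracy, precision, recall and F1 for the positive class."""
    total = c.tp + c.fp + c.fn + c.tn
    if total == 0:
        raise ValueError("confusion matrix is empty")

    predicted_pos = c.tp + c.fp
    actual_pos = c.tp + c.fn

    # Guard each ratio against an empty denominator.
    precision = c.tp / predicted_pos if predicted_pos else 0.0
    recall = c.tp / actual_pos if actual_pos else 0.0
    pr_sum = precision + recall
    # Assumption: F-measure taken as F1, the harmonic mean of precision and recall.
    f_measure = 2 * precision * recall / pr_sum if pr_sum else 0.0

    return {
        "accuracy": (c.tp + c.tn) / total,
        "precision": precision,
        "recall": recall,
        "f_measure": f_measure,
    }


# Illustrative counts only; these are NOT taken from the paper.
example = ConfusionCounts(tp=90, fp=10, fn=5, tn=95)
metrics = classification_metrics(example)
```

AUC-ROC is omitted from the sketch because it depends on the classifier's ranked scores rather than on a single confusion matrix.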
Publisher
World Scientific Pub Co Pte Ltd
Subject
Condensed Matter Physics, General Materials Science
Cited by
2 articles.