Affiliation:
1. Department of Computer Science, Vel Tech Rangarajan Dr. Sagunthala R & D, Institute of Science and Technology, Chennai, Tamil Nadu, India
Abstract
In the past decades, there has been a large increase in the number of people affected by diabetes, a chronic illness. Early prediction of diabetes remains a challenging problem because precise prediction requires clean and reliable datasets. In this era of ubiquitous information technology, big data makes it possible to collect large amounts of healthcare information. With the explosion in digital data generation, selecting appropriate data for analysis remains a complex task. Moreover, missing values and inadequately labeled data limit prediction accuracy. In this context, with the aim of improving the quality of the dataset, the proposed model works in three major phases: (1) pre-processing, (2) feature extraction, and (3) classification. Pre-processing involves outlier rejection and filling in missing values. Feature extraction is performed by principal component analysis (PCA), and the final prediction of diabetes is made by a distance adaptive-KNN (DA-KNN) classifier. The experiments were conducted on the Pima Indian Diabetes (PID) dataset, and the performance of the proposed model was compared with state-of-the-art models. The results show that the proposed model outperforms conventional models such as NB, SVM, KNN, and RF in terms of accuracy and ROC.
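The pre-processing and classification phases described above can be illustrated with a minimal sketch. The abstract does not specify how DA-KNN adapts its distances, so the classifier below is a plain inverse-distance-weighted KNN used as a stand-in, not the authors' method. Median imputation and the 1.5 x IQR outlier rule are likewise assumptions, the PCA step is omitted for brevity, and the data are toy values rather than the PID dataset. All function names are hypothetical.

```python
import math
from statistics import median, quantiles


def fill_missing(rows):
    """Replace None entries with the median of the observed values in that column.

    Median imputation is an assumption; the abstract does not name the fill rule.
    """
    cols = list(zip(*rows))
    meds = [median(v for v in col if v is not None) for col in cols]
    return [[meds[j] if v is None else v for j, v in enumerate(row)] for row in rows]


def reject_outliers(rows, labels, k=1.5):
    """Drop rows with any feature outside [Q1 - k*IQR, Q3 + k*IQR] (assumed rule)."""
    bounds = []
    for col in zip(*rows):
        q1, _, q3 = quantiles(col, n=4)
        iqr = q3 - q1
        bounds.append((q1 - k * iqr, q3 + k * iqr))
    kept = [
        (row, y)
        for row, y in zip(rows, labels)
        if all(lo <= v <= hi for v, (lo, hi) in zip(row, bounds))
    ]
    return [row for row, _ in kept], [y for _, y in kept]


def predict(train_x, train_y, query, k=3):
    """Inverse-distance-weighted KNN vote: a stand-in for DA-KNN, not the paper's rule."""
    nearest = sorted((math.dist(x, query), y) for x, y in zip(train_x, train_y))[:k]
    votes = {}
    for d, y in nearest:
        votes[y] = votes.get(y, 0.0) + 1.0 / (d + 1e-9)  # closer neighbours weigh more
    return max(votes, key=votes.get)


# Toy data: two hypothetical features, binary label, None marks a missing value.
raw_x = [[1.0, 1.0], [1.2, None], [0.9, 1.1], [5.0, 5.0],
         [5.2, 4.9], [None, 5.1], [100.0, 1.0]]
raw_y = [0, 0, 0, 1, 1, 1, 0]

clean_x, clean_y = reject_outliers(fill_missing(raw_x), raw_y)
print(len(clean_x), predict(clean_x, clean_y, [1.1, 1.0]), predict(clean_x, clean_y, [5.1, 5.0]))
```

In a fuller pipeline, a PCA projection would sit between outlier rejection and the classifier, and the number of neighbours `k` would be tuned on held-out data.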
Publisher
World Scientific Pub Co Pte Ltd
Subject
Computer Graphics and Computer-Aided Design, Computer Science Applications, Computer Vision and Pattern Recognition
Cited by
1 article.