Machine Learning Models for Data-Driven Prediction of Diabetes by Lifestyle Type

Published:2022-11-15 Issue:22 Volume:19 Page:15027
ISSN:1660-4601
Container-title:International Journal of Environmental Research and Public Health
language:en
Short-container-title:IJERPH

Author:

Qin Yifan, Wu Jinlong, Xiao Wen, Wang Kun, Huang Anbing, Liu Bowen, Yu Jingxuan, Li Chuhao, Yu Fengyu, Ren Zhanbing

Abstract

The prevalence of diabetes has been increasing in recent years, and previous research has found that machine-learning models are good diabetes prediction tools. The purpose of this study was to compare the efficacy of five machine-learning models for diabetes prediction using lifestyle data from the National Health and Nutrition Examination Survey (NHANES) database. The 1999–2020 NHANES database yielded data on 17,833 individuals, covering demographic characteristics and lifestyle-related variables. To screen training data for the machine-learning models, the Akaike Information Criterion (AIC) forward propagation algorithm was used. Five machine-learning models were developed to predict diabetes: CATBoost, XGBoost, Random Forest (RF), Logistic Regression (LR), and Support Vector Machine (SVM). Model performance was evaluated using accuracy, sensitivity, specificity, precision, F1 score, and the receiver operating characteristic (ROC) curve. Across the five models, dietary intake levels of energy, carbohydrate, and fat contributed the most to the prediction of diabetes. In terms of model performance, CATBoost ranked highest, ahead of RF, LR, XGBoost, and SVM, achieving an accuracy of 82.1% and an AUC of 0.83. Machine-learning models based on NHANES data can assist medical institutions in identifying diabetes patients.
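
As an illustration of the threshold-based metrics named in the abstract (accuracy, sensitivity, specificity, precision, and F1 score), the following is a minimal sketch of how they are conventionally derived from a binary confusion matrix. It is not the authors' code: the function name `binary_metrics`, the `BinaryMetrics` container, and the toy labels are assumptions made for illustration, and the labels are invented rather than drawn from NHANES.

```python
from dataclasses import dataclass


@dataclass
class BinaryMetrics:
    accuracy: float
    sensitivity: float
    specificity: float
    precision: float
    f1: float


def binary_metrics(y_true, y_pred):
    """Compute confusion-matrix metrics; label 1 = diabetes, 0 = no diabetes."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    def ratio(num, den):
        # Return 0.0 instead of raising when a denominator is empty.
        return num / den if den else 0.0

    precision = ratio(tp, tp + fp)
    sensitivity = ratio(tp, tp + fn)  # also called recall
    return BinaryMetrics(
        accuracy=ratio(tp + tn, len(pairs)),
        sensitivity=sensitivity,
        specificity=ratio(tn, tn + fp),
        precision=precision,
        f1=ratio(2 * precision * sensitivity, precision + sensitivity),
    )


# Toy labels, invented for illustration only (not NHANES data).
y_true = [1, 1, 1, 0, 0, 0, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 0, 1, 0, 0, 1, 0]
m = binary_metrics(y_true, y_pred)
print(m)
```

The AUC reported in the abstract is a different kind of quantity: it is computed from predicted scores across all decision thresholds, so it cannot be obtained from a single confusion matrix like the one above.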

Funder

National Natural Science Foundation of China

Research Foundation for Young Teacher of Shenzhen University

High-level Scientific Research Foundation for the Introduction of Talent of Shenzhen University

Natural Science Featured Innovation Projects in Ordinary Universities in Guangdong Province

Scientific Research Platform and Project of Colleges and Universities of Education Department of Guangdong Province

Publisher

MDPI AG

Subject

Health, Toxicology and Mutagenesis; Public Health, Environmental and Occupational Health

Link

https://www.mdpi.com/1660-4601/19/22/15027/pdf

References: 57 articles.

1. (2022, September 01). International Diabetes Federation. Available online: https://diabetesatlas.org/.

2. Effect of potentially modifiable risk factors associated with myocardial infarction in 52 countries (the INTERHEART study): Case-control study;Lancet,2004

3. Diabetic Kidney Disease: Challenges, Progress, and Possibilities;Clin. J. Am. Soc. Nephrol.,2017

4. Diabetic retinopathy—Ocular complications of diabetes mellitus;World J. Diabetes,2015

5. Diabetic foot disease: From the evaluation of the “foot at risk” to the novel diabetic ulcer treatment modalities;World J. Diabetes,2016

Cited by 15 articles.

1. Utilizing Machine Learning to Assess the Impact of Attitudinal, Knowledge, and Perceptual Factors on Diabetes Awareness;2024-08-26

2. Diabetes Prediction Using Machine Learning;2024 9th International Conference on Machine Learning Technologies (ICMLT);2024-05-24

3. SHAP-Guided GWO and ABC Feature Selection for Early Diabetes Prediction with XGBoost;2024 2nd International Conference on Advancement in Computation & Computer Technologies (InCACCT);2024-05-02

4. A comparative evaluation of machine learning ensemble approaches for disease prediction using multiple datasets;Health and Technology;2024-03-27

5. Training for endovascular therapy of acute arterial disease and procedure-related complication: An extracorporeally-perfused human cadaver model study;PLOS ONE;2024-02-08