Machine Learning for Predicting Micro- and Macrovascular Complications in Individuals With Prediabetes or Diabetes: Retrospective Cohort Study (Preprint)

Author:

Schallmoser Simon, Zueger Thomas, Kraus Mathias, Saar-Tsechansky Maytal, Stettler Christoph, Feuerriegel Stefan

Abstract

BACKGROUND

Micro- and macrovascular complications are a major burden for individuals with diabetes and can already arise in a prediabetic state. To allocate effective treatments and to possibly prevent these complications, identification of those at risk is essential.

OBJECTIVE

This study aimed to build machine learning (ML) models that predict the risk of developing a micro- or macrovascular complication in individuals with prediabetes or diabetes.

METHODS

We used electronic health records from Israel that span 2003 to 2013 and contain information about demographics, biomarkers, medications, and disease codes. The records were queried to identify individuals with prediabetes or diabetes in 2008. We then aimed to predict which of these individuals developed a micro- or macrovascular complication within the next 5 years. We included 3 microvascular complications: retinopathy, nephropathy, and neuropathy. In addition, we considered 3 macrovascular complications: peripheral vascular disease (PVD), cerebrovascular disease (CeVD), and cardiovascular disease (CVD). Complications were identified via disease codes; for nephropathy, the estimated glomerular filtration rate and albuminuria were considered additionally. To account for patient dropout, inclusion required complete information on age and sex and on disease codes (or, for nephropathy, measurements of estimated glomerular filtration rate and albuminuria) until 2013. For each prediction task, individuals diagnosed with that specific complication in or before 2008 were excluded. In total, 105 predictors from demographics, biomarkers, medications, and disease codes were used to build the ML models. We compared 2 ML models: logistic regression and gradient-boosted decision trees (GBDTs). To explain the predictions of the GBDTs, we calculated Shapley additive explanations (SHAP) values.
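The exclusion and labeling rules described above can be illustrated with a minimal sketch for a single complication. This is not the authors' code: the `Patient` record, its `first_diagnosis_year` field, and the reading of "within the next 5 years" as 2009 to 2013 are assumptions made for illustration. The sketch also ignores the laboratory criteria (estimated glomerular filtration rate and albuminuria) that the study used for nephropathy.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

INDEX_YEAR = 2008     # year in which prediabetes or diabetes was identified
HORIZON_YEARS = 5     # prediction window stated in the abstract


@dataclass
class Patient:
    # Hypothetical record layout, not the study's actual schema.
    patient_id: str
    first_diagnosis_year: Optional[int]  # first disease code for the complication, None if never coded


def label_cohort(patients: List[Patient]) -> Dict[str, int]:
    """Return {patient_id: 0 or 1} for one complication, dropping prevalent cases."""
    labels: Dict[str, int] = {}
    for p in patients:
        year = p.first_diagnosis_year
        if year is not None and year <= INDEX_YEAR:
            continue  # excluded: diagnosed in or before the index year
        # Positive if the first diagnosis falls inside the assumed 2009-2013 window.
        developed = year is not None and year <= INDEX_YEAR + HORIZON_YEARS
        labels[p.patient_id] = int(developed)
    return labels


patients = [
    Patient("a", 2006),   # prevalent case, excluded
    Patient("b", 2011),   # incident case within the window
    Patient("c", None),   # never diagnosed
    Patient("d", 2008),   # diagnosed in the index year, excluded
]
print(label_cohort(patients))
```

Because exclusion is applied per complication, the same individual can be excluded from one prediction task and still be included in another.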

RESULTS

Overall, 13,904 and 4259 individuals with prediabetes and diabetes, respectively, were identified in our underlying data set. For individuals with prediabetes, the areas under the receiver operating characteristic curve for logistic regression and GBDTs were, respectively, 0.657 and 0.681 (retinopathy), 0.807 and 0.815 (nephropathy), 0.727 and 0.706 (neuropathy), 0.730 and 0.727 (PVD), 0.687 and 0.693 (CeVD), and 0.707 and 0.705 (CVD); for individuals with diabetes, the areas under the receiver operating characteristic curve were, respectively, 0.673 and 0.726 (retinopathy), 0.763 and 0.775 (nephropathy), 0.745 and 0.771 (neuropathy), 0.698 and 0.715 (PVD), 0.651 and 0.646 (CeVD), and 0.686 and 0.680 (CVD). Overall, the prediction performance was comparable for logistic regression and GBDTs. The Shapley additive explanations values showed that increased levels of blood glucose, glycated hemoglobin, and serum creatinine were risk factors for microvascular complications. Age and hypertension were associated with an elevated risk for macrovascular complications.
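For readers unfamiliar with the metric reported above, the area under the receiver operating characteristic curve equals the probability that a randomly chosen individual who developed the complication receives a higher predicted risk than a randomly chosen individual who did not, with ties counted as one half. The following is a minimal sketch of that pairwise definition. The function name `auroc` and the toy inputs are illustrative only, and the study's actual evaluation code is not described in the abstract.

```python
from typing import Sequence


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via pairwise comparison of positives and negatives."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive case ranked above negative case
            elif p == n:
                wins += 0.5   # ties count as half a win
    return wins / (len(pos) * len(neg))


# Toy example: 3 of the 4 positive-negative pairs are ordered correctly.
print(auroc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # 0.75
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking. The pairwise loop is quadratic in cohort size, so a rank-based implementation would be preferable for cohorts as large as the one in this study.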

CONCLUSIONS

Our ML models can identify individuals with prediabetes or diabetes who are at increased risk of developing micro- or macrovascular complications. The prediction performance varied across complications and target populations but was in an acceptable range for most prediction tasks.

Publisher

JMIR Publications Inc.
