Machine learning-based reproducible prediction of type 2 diabetes subtypes

Author:

Tanabe Hayato, Sato Masahiro, Miyake Akimitsu, Shimajiri Yoshinori, Ojima Takafumi, Narita Akira, Saito Haruka, Tanaka Kenichi, Masuzaki Hiroaki, Kazama Junichiro J., Katagiri Hideki, Tamiya Gen, Kawakami Eiryo, Shimabukuro Michio

Abstract

Aims/hypothesis: Clustering-based subclassification of type 2 diabetes, which reflects pathophysiology and genetic predisposition, is a promising approach for providing personalised and effective therapeutic strategies. Ahlqvist’s classification is currently the most vigorously validated method because of its superior ability to predict diabetes complications, but it does not have strong consistency over time and requires HOMA2 indices, which are not routinely available in clinical practice and standard cohort studies. We developed a machine learning (ML) model to classify individuals with type 2 diabetes into Ahlqvist’s subtypes consistently over time.

Methods: The Cohort 1 dataset comprised 619 Japanese individuals with type 2 diabetes, who were divided into training and test sets for ML models in a 7:3 ratio. The Cohort 2 dataset, comprising 597 individuals with type 2 diabetes, was used for external validation. Participants were pre-labelled (T2Dkmeans) by unsupervised k-means clustering based on Ahlqvist’s variables (age at diagnosis, BMI, HbA1c, HOMA2-B and HOMA2-IR) into four subtypes: severe insulin-deficient diabetes (SIDD), severe insulin-resistant diabetes (SIRD), mild obesity-related diabetes (MOD) and mild age-related diabetes (MARD). We adopted 15 variables for a multiclass classification random forest (RF) algorithm to predict type 2 diabetes subtypes (T2DRF15). The proximity matrix computed by RF was visualised using uniform manifold approximation and projection. Finally, we used a putative subset with missing insulin-related variables to test the predictive performance in the validation cohort, the consistency of subtypes over time and the ability to predict diabetes complications.

Results: T2DRF15 demonstrated 94% accuracy for predicting T2Dkmeans type 2 diabetes subtypes (AUCs ≥0.99 and F1 score [an indicator calculated as the harmonic mean of precision and recall] ≥0.9) and retained its predictive performance in the external validation cohort (86.3%). When used with an imputation algorithm, T2DRF15 also showed an accuracy of 82.9% for detecting T2Dkmeans subtypes in a putative subset with missing insulin-related variables. In Kaplan–Meier analysis, the diabetes clusters of T2DRF15 demonstrated distinct cumulative risks of diabetic retinopathy in SIDD and of chronic kidney disease in SIRD during a median observation period of 11.6 (4.5–18.3) years, similarly to the subtypes defined by T2Dkmeans. The predictive accuracy improved after excluding individuals with low predictive probability, who were categorised as an ‘undecidable’ cluster. After excluding undecidable individuals, T2DRF15 showed higher consistency over time (100% for SIDD, 68.6% for SIRD, 94.4% for MOD and 97.9% for MARD) than T2Dkmeans.

Conclusions/interpretation: The new ML model for predicting Ahlqvist’s subtypes of type 2 diabetes has great potential for application in clinical practice and cohort studies because it can classify individuals with missing HOMA2 indices and predict glycaemic control, diabetic complications and treatment outcomes with long-term consistency by using readily available variables. Future studies are needed to assess whether our approach is applicable to research and/or clinical practice in multiethnic populations.
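The abstract mentions two ideas that are easy to illustrate in isolation: assigning individuals with low predictive probability to an ‘undecidable’ cluster, and the F1 score as the harmonic mean of precision and recall. The following is a minimal sketch of both, not the authors' implementation. The function names, the dictionary layout of class probabilities and the 0.6 cut-off are assumptions made for illustration; the abstract does not state the probability threshold that was used.

```python
from typing import Dict, List

# The four Ahlqvist subtypes named in the abstract.
SUBTYPES = ("SIDD", "SIRD", "MOD", "MARD")


def assign_subtype(probabilities: Dict[str, float], threshold: float = 0.6) -> str:
    """Return the most probable subtype, or 'undecidable' if it is below threshold.

    `threshold` is a hypothetical value; the abstract does not report one.
    """
    best = max(probabilities, key=probabilities.get)
    return best if probabilities[best] >= threshold else "undecidable"


def f1_score(y_true: List[str], y_pred: List[str], label: str) -> float:
    """Per-class F1: the harmonic mean of precision and recall for one subtype."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == label and p == label for t, p in pairs)  # true positives
    fp = sum(t != label and p == label for t, p in pairs)  # false positives
    fn = sum(t == label and p != label for t, p in pairs)  # false negatives
    if tp == 0:
        return 0.0  # precision and recall are both zero or undefined
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Made-up class probabilities, standing in for a classifier's output.
    confident = {"SIDD": 0.90, "SIRD": 0.05, "MOD": 0.03, "MARD": 0.02}
    ambiguous = {"SIDD": 0.40, "SIRD": 0.30, "MOD": 0.20, "MARD": 0.10}
    print(assign_subtype(confident))
    print(assign_subtype(ambiguous))
    print(f1_score(["SIDD", "SIDD", "MOD"], ["SIDD", "MOD", "MOD"], "SIDD"))
```

In practice the probabilities would come from a trained classifier, for example the class-vote proportions of a random forest, and the cut-off would be tuned on held-out data rather than fixed in advance.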

Funder

Japan Society for the Promotion of Science

Japan Science and Technology Agency

Publisher

Springer Science and Business Media LLC
