Author:
Han Yu-Mei, Yang Hui, Huang Qin-Lai, Sun Zi-Jie, Li Ming-Liang, Zhang Jing-Bo, Deng Ke-Jun, Chen Shuo, Lin Hao
Abstract
<abstract>
<p>Diabetes is a metabolic disorder caused by insufficient insulin secretion and insulin secretion disorders. The progression from health to diabetes generally passes through three stages: health, pre-diabetes and type 2 diabetes. Early diagnosis is the most effective way to prevent and control diabetes and its complications. In this work, we collected physical examination data from the Beijing Physical Examination Center covering January 2006 to December 2017, and divided the population into three groups according to the WHO (1999) Diabetes Diagnostic Standards: normal fasting plasma glucose (NFG) (FPG < 6.1 mmol/L), mildly impaired fasting plasma glucose (IFG) (6.1 mmol/L ≤ FPG < 7.0 mmol/L) and type 2 diabetes (T2DM) (FPG ≥ 7.0 mmol/L). In total, we obtained 1,221,598 NFG samples, 285,965 IFG samples and 387,076 T2DM samples, each with 15 physical examination indexes. Using eXtreme Gradient Boosting (XGBoost), random forest (RF), logistic regression (LR), and a fully connected neural network (FCN) as classifiers, we constructed four models to distinguish NFG, IFG and T2DM. The comparison shows that XGBoost performs best, with a macro-averaged AUC of 0.7874 and a micro-averaged AUC of 0.8633. In addition, based on the XGBoost classifier, three binary classification models were established to discriminate NFG from IFG, NFG from T2DM, and IFG from T2DM. On the independent dataset, their AUCs were 0.7808, 0.8687 and 0.7067, respectively. Finally, we analyzed the importance of the features and identified the risk factors associated with diabetes.</p>
</abstract>
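The three-group labeling described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' code: the function name `fpg_group` and its error handling are hypothetical, and the sketch assumes that a value of exactly 7.0 mmol/L is assigned to T2DM, as in the WHO (1999) criteria.

```python
def fpg_group(fpg_mmol_per_l: float) -> str:
    """Map a fasting plasma glucose value (mmol/L) to a group label.

    Hypothetical helper for illustration only; thresholds follow the
    ranges quoted in the abstract (6.1 and 7.0 mmol/L).
    """
    if fpg_mmol_per_l < 0:
        raise ValueError("FPG must be non-negative")
    if fpg_mmol_per_l < 6.1:
        return "NFG"   # normal fasting plasma glucose
    if fpg_mmol_per_l < 7.0:
        return "IFG"   # mildly impaired fasting plasma glucose
    return "T2DM"      # assumption: exactly 7.0 mmol/L counts as T2DM


# Example values chosen for illustration, not taken from the dataset.
labels = [fpg_group(v) for v in (5.2, 6.1, 6.9, 7.0, 9.4)]
```

Writing the thresholds as an ordered chain of comparisons keeps the three ranges mutually exclusive and exhaustive, so every non-negative FPG value receives exactly one label.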
Publisher
American Institute of Mathematical Sciences (AIMS)
Subject
Applied Mathematics,Computational Mathematics,General Agricultural and Biological Sciences,Modeling and Simulation,General Medicine
Cited by
14 articles.