Affiliation:
1. Kaohsiung Armed Forces General Hospital
2. Fu Jen Catholic University
3. Taipei Medical University
Abstract
The prevalence of type 2 diabetes (T2D) has increased drastically in recent decades. At the same time, dementia has been noted to be related to T2D. Traditionally, multiple linear regression (MLR) has been the most commonly used method for analyzing these kinds of relationships. However, machine learning (Mach-L) methods have emerged recently, and they can capture non-linear relationships better than MLR. In the present study, we enrolled older patients with T2D and used four different Mach-L methods to analyze the relationships between risk factors and cognitive function. Our goals were, first, to compare the accuracy of MLR and Mach-L in predicting cognitive function and, second, to rank the importance of the risk factors for impaired cognitive function in T2D.
A total of 197 older patients with T2D were enrolled (98 men and 99 women). Demographic and biochemistry data were used as independent variables, and the cognitive function assessment (CFA) score, measured by the Montreal Cognitive Assessment, was regarded as the dependent variable. In addition to traditional MLR, random forest (RF), stochastic gradient boosting (SGB), Naïve Bayes classifier (NB) and eXtreme gradient boosting (XGBoost) were applied.
Our results showed that RF, SGB, NB and XGBoost all outperformed MLR. Education level, age, frailty score, fasting plasma glucose and body mass index were identified as the important factors, in descending order of importance.
In conclusion, our study demonstrated that RF, SGB, NB and XGBoost are more accurate than MLR in predicting the CFA score. According to these methods, the risk factors ranked by importance are, in descending order, education level, age, frailty score, fasting plasma glucose and body mass index in a Chinese T2D cohort.
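The general point that tree-based boosting can capture a non-linear relationship that a linear fit misses can be illustrated with a minimal sketch. This is not the study's pipeline or data. It assumes a purely synthetic, noise-free U-shaped relationship between one hypothetical predictor and one outcome, and it uses plain (non-stochastic) gradient boosting with one-split regression stumps written from scratch. All function and variable names here are illustrative.

```python
import math

def fit_linear(xs, ys):
    """Ordinary least squares with one predictor; returns (intercept, slope)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return my - slope * mx, slope

def fit_stump(xs, residuals):
    """Best single-split regression stump: (threshold, left_mean, right_mean)."""
    best = None
    for t in sorted(set(xs))[1:]:  # skip the smallest x so both sides are non-empty
        left = [r for x, r in zip(xs, residuals) if x < t]
        right = [r for x, r in zip(xs, residuals) if x >= t]
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        sse = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
        if best is None or sse < best[0]:
            best = (sse, t, lm, rm)
    return best[1:]

def fit_boosting(xs, ys, rounds=200, lr=0.1):
    """Gradient boosting for squared error: each stump fits the current residuals."""
    base = sum(ys) / len(ys)
    preds = [base] * len(xs)
    stumps = []
    for _ in range(rounds):
        residuals = [y - p for y, p in zip(ys, preds)]
        t, lm, rm = fit_stump(xs, residuals)
        stumps.append((t, lm, rm))
        preds = [p + lr * (lm if x < t else rm) for x, p in zip(xs, preds)]
    return base, stumps, lr

def predict_boosting(model, x):
    base, stumps, lr = model
    return base + sum(lr * (lm if x < t else rm) for t, lm, rm in stumps)

def rmse(ys, preds):
    return math.sqrt(sum((y - p) ** 2 for y, p in zip(ys, preds)) / len(ys))

# Synthetic data (an assumption for illustration only): y = (x - 5)^2.
train_x = [i * 0.5 for i in range(21)]        # 0.0, 0.5, ..., 10.0
test_x = [0.25 + i * 0.5 for i in range(20)]  # points between the training x values
train_y = [(x - 5) ** 2 for x in train_x]
test_y = [(x - 5) ** 2 for x in test_x]

intercept, slope = fit_linear(train_x, train_y)
boost_model = fit_boosting(train_x, train_y)

rmse_linear = rmse(test_y, [intercept + slope * x for x in test_x])
rmse_boost = rmse(test_y, [predict_boosting(boost_model, x) for x in test_x])
print(f"Linear RMSE: {rmse_linear:.3f}, boosted stumps RMSE: {rmse_boost:.3f}")
```

On this symmetric toy relationship the fitted linear slope is essentially zero, so the linear model can only predict the mean, while the boosted stumps approximate the curve piecewise. Real cohort data are noisy and multivariate, so a gap of this kind would need to be checked with held-out error rather than assumed.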
Publisher
Research Square Platform LLC