Author:
Chen Si-Ding, You Jia, Yang Xiao-Meng, Gu Hong-Qiu, Huang Xin-Ying, Liu Huan, Feng Jian-Feng, Jiang Yong, Wang Yong-jun
Abstract
Objective
We aimed to investigate factors related to poor 90-day prognosis (modified Rankin Scale [mRS] score ≥3) in patients with transient ischemic attack (TIA) or minor stroke, to construct models predicting poor 90-day prognosis in these patients, and to compare the predictive performance of machine learning models with that of a logistic regression model.
Method
We selected patients with TIA or minor stroke from a prospective registry study (CNSR-III). We assessed demographic characteristics, smoking history, drinking history (≥20 g/day), physiological data, medical history, secondary prevention treatment, in-hospital evaluation and education, laboratory data, neurological severity, mRS score, and TOAST classification. Patients were randomly divided into a training set and a test set at a ratio of 70:30. Univariate and multivariate logistic regression analyses were performed in the training set to identify predictors associated with poor outcome (mRS≥3). These predictors were used to build the machine learning models and the traditional logistic regression model. The training set was used to construct the prediction models, and the test set was used to evaluate their performance. Discrimination was evaluated with the area under the curve (AUC), and calibration with the Brier score (or calibration plot).
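As an illustration of the evaluation setup described above, the following is a minimal sketch of a 70:30 random split and of the two evaluation indicators (AUC and Brier score). It is not the authors' code: the function names `split_70_30`, `auc`, and `brier_score` are hypothetical, the toy labels and probabilities are invented for demonstration, and only the Python standard library is assumed.

```python
import random


def split_70_30(records, seed=0):
    """Shuffle a copy of `records` and split it 70:30 into (train, test)."""
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    shuffled = list(records)
    rng.shuffle(shuffled)
    cut = int(round(0.7 * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]


def auc(y_true, y_prob):
    """AUC as the fraction of (positive, negative) pairs in which the
    positive case receives the higher predicted probability; ties count 0.5."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and outcome
    (0 = good outcome, 1 = poor outcome); lower means better calibration."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


# Toy example (invented values): 1 = poor 90-day outcome (mRS >= 3).
y_true = [0, 0, 1, 1]
y_prob = [0.1, 0.4, 0.35, 0.8]
toy_auc = auc(y_true, y_prob)
toy_brier = brier_score(y_true, y_prob)

train_ids, test_ids = split_70_30(list(range(10)))
```

In this toy example three of the four positive-negative pairs are ranked correctly, so the AUC is 0.75. The pairwise AUC computation is quadratic in the number of patients and is meant only to make the definition explicit; a rank-based implementation would be preferable for a cohort of this size.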
Result
A total of 10967 patients with TIA or minor stroke were enrolled in this study, with an average age of 61.77 ± 11.18 years; women accounted for 30.68%. Factors associated with poor prognosis in these patients included sex, age, stroke history, heart rate, D-dimer, creatinine, TOAST classification, admission mRS, discharge mRS, and discharge NIHSS score. All models, whether constructed by logistic regression or by machine learning, performed well in predicting poor 90-day prognosis (AUC >0.800). The model with the highest AUC in the test set was CatBoost (AUC=0.839), followed by XGBoost, GBDT, random forest, and AdaBoost (AUCs of 0.838, 0.835, 0.832, and 0.823, respectively). CatBoost and XGBoost predicted poor prognosis at 90 days better than the logistic regression model, and the difference was statistically significant (P<0.05). All models, whether constructed by logistic regression or by machine learning, had good calibration.
Conclusion
Machine learning algorithms were not inferior to the logistic regression model in predicting poor prognosis at 90 days in patients with TIA or minor stroke. Among them, the CatBoost model had the best predictive performance. All models provided good discrimination.
Publisher
Springer Science and Business Media LLC
Subject
Health Informatics, Epidemiology