Author:
Chen Haoran, Yang Fengchun, Duan Yifan, Yang Lin, Li Jiao
Abstract
Background
This study aimed to develop a higher-performance nomogram based on explainable machine learning methods, and to predict the risk of death of stroke patients within 30 days based on clinical characteristics recorded on the first day of intensive care unit (ICU) admission.
Methods
Data relating to stroke patients were extracted from the Medical Information Mart for Intensive Care (MIMIC) IV and III databases. The LightGBM machine learning approach together with Shapley additive explanations (SHAP), jointly termed explainable machine learning (EML), was used to select clinical features and define cut-off points for the selected features. These selected features and cut-off points were then evaluated using the Cox proportional hazards regression model and Kaplan-Meier survival curves. Finally, logistic regression-based nomograms for predicting 30-day mortality of stroke patients were constructed using the original variables and the variables dichotomized by cut-off points, respectively. The performance of the two nomograms was evaluated in the overall and individual dimensions.
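The dichotomize-then-compare step described above can be illustrated with a minimal sketch. It assumes a single variable, a hypothetical cut-off of 6, and toy follow-up data; none of these values come from the study, and the function names `dichotomize` and `kaplan_meier` are illustrative only. It uses plain Python rather than the software the authors used.

```python
def dichotomize(values, cutoff):
    """Return 1 for values above the cut-off and 0 otherwise."""
    return [1 if v > cutoff else 0 for v in values]


def kaplan_meier(times, events):
    """Kaplan-Meier estimate of the survival function.

    times  : follow-up time in days for each patient
    events : 1 if the patient died at that time, 0 if censored
    Returns a list of (time, survival probability) pairs, one per
    distinct time at which at least one death occurred.
    """
    curve = []
    survival = 1.0
    at_risk = len(times)
    for t in sorted(set(times)):
        deaths = sum(1 for ti, ei in zip(times, events) if ti == t and ei == 1)
        leaving = sum(1 for ti in times if ti == t)  # deaths plus censored at t
        if deaths > 0:
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve


# Toy data (hypothetical, not from MIMIC): one score per patient,
# follow-up truncated at 30 days, event = 1 means death.
score = [9, 11, 3, 4, 8, 2, 5, 10]
times = [5, 10, 30, 30, 12, 30, 30, 30]
events = [1, 1, 0, 0, 1, 0, 0, 0]

group = dichotomize(score, cutoff=6)  # hypothetical cut-off
for label in (1, 0):
    t = [ti for ti, g in zip(times, group) if g == label]
    e = [ei for ei, g in zip(events, group) if g == label]
    print("above cut-off" if label else "at or below cut-off", kaplan_meier(t, e))
```

In practice the two curves would be compared with a log-rank test or a Cox model, which this sketch omits.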
Results
A total of 2982 stroke patients and 64 clinical features were included, and the 30-day mortality rate was 23.6% in the MIMIC-IV dataset. Ten variables (“sofa (sepsis-related organ failure assessment)”, “minimum glucose”, “maximum sodium”, “age”, “mean spo2 (blood oxygen saturation)”, “maximum temperature”, “maximum heart rate”, “minimum bun (blood urea nitrogen)”, “minimum wbc (white blood cells)” and “charlson comorbidity index”) and their respective cut-off points were identified by the EML. In the Cox proportional hazards regression model (Cox regression) and Kaplan-Meier survival curves, after stroke patients were grouped according to the cut-off point of each variable, patients in the high-risk subgroup had higher 30-day mortality than those in the low-risk subgroup. The evaluation showed that the EML-based nomogram not only outperformed the conventional nomogram in NRI (net reclassification index), Brier score and clinical net benefit in the overall dimension, but also improved significantly in the individual dimension, especially for patients with low “maximum temperature”.
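Two of the overall metrics named above have simple definitions, sketched here for illustration. The sketch assumes a two-category NRI at a hypothetical risk threshold of 0.5 (the abstract does not state which NRI variant or threshold was used), and the predicted probabilities are toy values, not study results.

```python
def brier_score(y_true, y_prob):
    """Mean squared difference between predicted risk and observed outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def net_reclassification_index(y_true, prob_old, prob_new, threshold=0.5):
    """Two-category NRI of a new model relative to an old one.

    A patient is 'high risk' when the predicted probability is at or
    above the threshold. The threshold of 0.5 is an assumption.
    """
    up_event = down_event = up_nonevent = down_nonevent = 0
    for y, p_old, p_new in zip(y_true, prob_old, prob_new):
        old_high = p_old >= threshold
        new_high = p_new >= threshold
        moved_up = new_high and not old_high
        moved_down = old_high and not new_high
        if y == 1:
            up_event += moved_up
            down_event += moved_down
        else:
            up_nonevent += moved_up
            down_nonevent += moved_down
    n_event = sum(y_true)
    n_nonevent = len(y_true) - n_event
    return ((up_event - down_event) / n_event
            + (down_nonevent - up_nonevent) / n_nonevent)


# Toy outcomes and predicted 30-day mortality risks (hypothetical).
y = [1, 1, 0, 0]
conventional = [0.4, 0.6, 0.6, 0.2]
eml_based = [0.7, 0.8, 0.3, 0.1]

print("Brier (conventional):", brier_score(y, conventional))
print("Brier (EML-based):", brier_score(y, eml_based))
print("NRI:", net_reclassification_index(y, conventional, eml_based))
```

A lower Brier score indicates better calibrated and more accurate risk estimates, while a positive NRI means the new model moves more patients into the correct risk category than the wrong one.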
Conclusions
The 10 selected clinical features from the first day of ICU admission require greater attention in stroke patients. The nomogram based on explainable machine learning is expected to have greater clinical applicability.
Funder
Beijing Natural Science Foundation
the CAMS Innovation Fund for Medical Sciences
the Program of Chinese Academy of Medical Sciences
Publisher
Springer Science and Business Media LLC