BACKGROUND
Stroke is a leading cause of death and disability, and its burden is rapidly increasing worldwide. Predicting in-hospital adverse events after stroke can substantially improve risk stratification and patient management.
OBJECTIVE
To predict in-hospital adverse events in patients with ischemic stroke in an interpretable way, we design a novel feature engineering procedure combined with machine learning methods. The procedure optimizes the dataset, selects features, and eliminates bias. In particular, we further explore the interpretability of the prediction model.
METHODS
We apply the feature engineering procedure to retrospectively analyze a 3-year registry dataset of ischemic stroke patients in three steps: data preprocessing, feature selection, and feature weighting. We then construct prediction models based on common machine learning methods and analyze their interpretability in two ways: the contribution of all features via SHAP (global perspective) and visual analysis via PDP (local perspective). The models use multiple candidate predictors, including demographic characteristics, medical history, stroke severity, medication history, and clinical measurement indicators. The evaluation metrics are specificity, sensitivity, and area under the curve (AUC).
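As an illustration of the three evaluation metrics named above, the following minimal sketch computes sensitivity, specificity, and AUC in plain Python. The labels, scores, and the 0.5 decision threshold are invented for the example and are not taken from the study; the function names are hypothetical.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels coded 0/1."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    return tp / (tp + fn), tn / (tn + fp)


def auc(y_true, scores):
    """AUC as the probability that a random positive case outscores
    a random negative case (ties count as one half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Invented toy data: 1 = adverse event, 0 = no adverse event.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2, 0.1, 0.6]

# Assumed threshold of 0.5 to turn scores into class predictions.
y_pred = [1 if s >= 0.5 else 0 for s in scores]

sens, spec = sensitivity_specificity(y_true, y_pred)
area = auc(y_true, scores)
print(f"sensitivity={sens:.3f} specificity={spec:.3f} AUC={area:.3f}")
```

Sensitivity and specificity depend on the chosen threshold, whereas AUC summarizes ranking quality across all thresholds, which is why the three are typically reported together.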
RESULTS
The study includes 2310 eligible patients. Feature engineering effectively optimizes the dataset, selects more valuable features, and eliminates statistical and social biases. The best performance is obtained with XGB, which achieves an AUC of 0.784 on the original dataset. After feature selection, XGB achieves an AUC of 0.747 on the dataset restricted to the top 15 features, with a specificity of 0.789 and a sensitivity of 0.705. Ranked by SHAP value, the top 4 features are age, NIHSS score, atrial fibrillation, and ability to walk within 48 hours of admission. PDP clearly shows how these 4 features affect the model's predictions.
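To make the PDP analysis concrete, the sketch below computes a partial dependence curve by hand: for each grid value of one feature, that feature is overwritten in every row and the model's predictions are averaged. The model here is a hypothetical logistic function with invented coefficients standing in for a fitted classifier, and the patient rows are made up; neither comes from the study.

```python
import math


def toy_risk_model(row):
    """Hypothetical stand-in for a fitted classifier.
    The coefficients are invented for illustration only."""
    z = -6.0 + 0.05 * row["age"] + 0.15 * row["nihss"]
    return 1.0 / (1.0 + math.exp(-z))


def partial_dependence(model, rows, feature, grid):
    """Average model prediction with `feature` fixed at each grid value."""
    curve = []
    for value in grid:
        # Copy each row with the feature overwritten; originals stay intact.
        preds = [model({**row, feature: value}) for row in rows]
        curve.append((value, sum(preds) / len(preds)))
    return curve


# Invented rows: age in years, NIHSS score at admission.
rows = [
    {"age": 55, "nihss": 2},
    {"age": 68, "nihss": 7},
    {"age": 80, "nihss": 15},
]

pd_age = partial_dependence(toy_risk_model, rows, "age", [50, 60, 70, 80])
for value, avg in pd_age:
    print(f"age={value}: mean predicted risk={avg:.3f}")
```

Plotting the returned pairs gives the partial dependence plot for that feature; with a real model, the same loop would call its probability prediction function instead of the toy function.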
CONCLUSIONS
The designed feature engineering procedure enables machine learning methods to achieve better and more objective results, supporting comprehensive decisions based on specific research conditions. In addition, exploring interpretability extends the application prospects of machine learning methods.