Interpretable prediction for in-hospital adverse stroke events with feature engineering (Preprint)

Published: 2021-03-09

Author:

Zhang Shuo, Li Runzhi, Wang Nan, Song Bo, Dai Honghua, Xu Yuming

Abstract

BACKGROUND

Stroke is a leading cause of death and disability, and its burden is rapidly increasing worldwide. Predicting in-hospital adverse stroke events can significantly improve the risk stratification and management of patients.

OBJECTIVE

To predict in-hospital adverse events in patients with ischemic stroke in an interpretable way, we design a novel feature engineering procedure combined with machine learning methods. The procedure optimizes the dataset, selects features, and eliminates bias. We also explore the interpretability of the prediction models.

METHODS

We apply the feature engineering procedure to retrospectively analyze a 3-year registry dataset of ischemic stroke patients in three steps: data preprocessing, feature selection, and feature weighting. We then construct prediction models based on common machine learning methods and analyze their interpretability in two ways: the contribution of all features measured by SHAP (global perspective) and visual analysis with partial dependence plots (PDP; local perspective). The prediction models use multiple candidate predictors, including demographic characteristics, medical history, stroke severity, medication history, and clinical measurement indicators. The evaluation metrics are specificity, sensitivity, and area under the curve (AUC).
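The three evaluation metrics named above can be illustrated with a minimal sketch in plain Python. The labels, scores, and the 0.5 decision threshold below are invented for illustration and are not taken from the study; the function names are hypothetical, and the abstract does not say how the authors computed the metrics.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[float, float]:
    """Return (sensitivity, specificity) for binary labels where 1 = adverse event."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # events correctly flagged
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # events missed
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # non-events correctly cleared
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false alarms
    return tp / (tp + fn), tn / (tn + fp)


def auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy labels and predicted risks, invented for illustration only (not study data).
y_true = [1, 0, 1, 0, 0, 1]
scores = [0.9, 0.2, 0.4, 0.6, 0.1, 0.8]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # the 0.5 threshold is an assumption

sens, spec = sensitivity_specificity(y_true, y_pred)
roc_auc = auc(y_true, scores)
```

Sensitivity and specificity depend on the chosen threshold, while AUC summarizes ranking quality across all thresholds, which is why the three are usually reported together.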

RESULTS

The study included 2310 eligible patients. Feature engineering effectively optimizes the dataset, selects more valuable features, and eliminates statistical and social biases. The best performance is obtained with XGB, which achieves an AUC of 0.784 on the original dataset. After feature selection, on the dataset restricted to the top 15 features, XGB achieves an AUC of 0.747, with a specificity of 0.789 and a sensitivity of 0.705. The top 4 features ranked by SHAP value are age, NIHSS score, atrial fibrillation, and ability to walk within 48 hours of admission. PDP also shows clearly how these 4 features affect the model's predictions.
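As a rough sketch of the two interpretability steps mentioned here, the code below ranks features by mean absolute per-patient contribution (a common way to turn SHAP values into a global ranking) and computes a partial dependence curve by averaging model output while one feature is held at each grid value. The contribution values, the linear `toy_model`, and the helper names are all hypothetical; the sketch does not use the SHAP library or reproduce the authors' XGB model.

```python
from typing import Callable, Dict, List, Sequence

Row = Dict[str, float]


def rank_by_mean_abs(contribs: Dict[str, Sequence[float]]) -> List[str]:
    """Order features by mean absolute per-patient contribution, largest first."""
    mean_abs = {name: sum(abs(v) for v in vals) / len(vals) for name, vals in contribs.items()}
    return sorted(mean_abs, key=lambda name: mean_abs[name], reverse=True)


def partial_dependence(model: Callable[[Row], float], rows: Sequence[Row],
                       feature: str, grid: Sequence[float]) -> List[float]:
    """Average model output over all rows with `feature` forced to each grid value."""
    curve = []
    for value in grid:
        preds = [model({**row, feature: value}) for row in rows]
        curve.append(sum(preds) / len(preds))
    return curve


# Hypothetical per-patient contributions (stand-ins for SHAP values, not study data).
contribs = {
    "age": [0.25, -0.5, 0.75],
    "nihss": [0.5, 0.25, -0.25],
    "atrial_fibrillation": [0.0, 0.25, -0.25],
}
ranking = rank_by_mean_abs(contribs)
top_features = ranking[:2]  # analogous to keeping the top 15 features in the study


def toy_model(row: Row) -> float:
    """Hypothetical risk score; a real model would be a fitted classifier."""
    return 0.01 * row["age"] + 0.02 * row["nihss"]


rows = [{"age": 60.0, "nihss": 4.0}, {"age": 80.0, "nihss": 10.0}]
age_curve = partial_dependence(toy_model, rows, "age", [50.0, 70.0])
```

Plotting `age_curve` against the grid gives the partial dependence plot for age; for a fitted model the same averaging reveals how predicted risk changes as a single feature varies.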

CONCLUSIONS

The design of the feature engineering procedure enables machine learning methods to achieve better and more objective results, supporting comprehensive decisions under specific research conditions. In addition, exploring interpretability will broaden the application prospects of machine learning methods.

Publisher

JMIR Publications Inc.
