BACKGROUND
Delirium in hospitalized patients is a worldwide problem, not only because of the burden it places on healthcare professionals but also because of its impact on patient prognosis. Although much research has addressed the use of machine learning to predict delirium in advance, the results have rarely been applied to clinical practice. Explainable artificial intelligence (XAI) techniques have been increasingly adopted in recent years because they present not only the predictions of AI models but also the processes that led to them.
OBJECTIVE
In this study, we focused on visualizing the predictors of delirium using an XAI method and implementing the analysis results in clinical practice.
METHODS
Retrospective data of 55389 patients hospitalized in a single acute care center in Japan between December 2017 and February 2022 were collected. To develop delirium predictive models, patients were categorized into two analysis populations according to inclusion and exclusion criteria. First, the predictive performance of four machine learning algorithms (ridge regression [RIDGE], least absolute shrinkage and selection operator regression [LASSO], random forest [RF], and eXtreme gradient boosting [XGBoost]) in delirium prediction was investigated. The predictors were then visualized using Shapley additive explanations (SHAP) and incorporated into clinical practice.
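As an illustration of the idea behind SHAP, the sketch below computes exact Shapley values for a single patient under a toy risk function. It is a minimal sketch, not the study's model or the SHAP library: the function `toy_risk`, its thresholds and weights, and the patient and reference values are all hypothetical and chosen only to make the arithmetic easy to follow.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values of predict(x) relative to predict(baseline).

    A feature outside the coalition is replaced by its baseline value.
    The cost grows exponentially with the number of features, so this
    brute-force version is only suitable for tiny examples.
    """
    n = len(x)
    features = list(range(n))

    def value(coalition):
        z = [x[i] if i in coalition else baseline[i] for i in features]
        return predict(z)

    phi = []
    for i in features:
        others = [j for j in features if j != i]
        total = 0.0
        for k in range(len(others) + 1):
            # Shapley weight for coalitions of size k that exclude feature i
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = set(subset)
                total += weight * (value(s | {i}) - value(s))
        phi.append(total)
    return phi


def toy_risk(z):
    """Hypothetical risk score; thresholds and weights are illustrative only."""
    age, bmi, albumin = z
    score = 0.0
    if age >= 75:
        score += 0.3
    if bmi < 18.5:
        score += 0.2
    if albumin < 3.5:
        score += 0.2
    if age >= 75 and albumin < 3.5:
        score += 0.1  # interaction term, shared between age and albumin
    return score


patient = [82, 17.0, 3.0]     # hypothetical patient: age, BMI, albumin
reference = [60, 22.0, 4.2]   # hypothetical reference patient
phi = shapley_values(toy_risk, patient, reference)
for name, contribution in zip(["age", "bmi", "albumin"], phi):
    print(f"{name}: {contribution:+.3f}")
```

The contributions sum to the difference between the patient's score and the reference score, which is the additivity property that makes such attributions readable as a per-feature breakdown of one prediction.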
RESULTS
The median age (interquartile range) of the 55389 patients was 73.0 (64.0–82.0) years, and 33829 (61.1%) were men. The area under the receiver operating characteristic curve of the machine learning models ranged from 0.795 to 0.852 across the two populations, indicating excellent discriminative performance. The calibration slope ranged from 0.692 to 1.284, indicating excellent calibration for all machine learning models except RF. SHAP visualization identified body mass index and albumin values as critical predictive contributors to delirium. These factors had not previously been considered in calculating risk scores at the acute care hospital in this study and were subsequently added to the new risk score. The cut-off value for age, which was previously unknown, was also visualized, and the risk threshold for age was increased. These revisions could reduce the false positive and false negative rates by 4.66% and 0.02%, respectively.
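For readers less familiar with the two performance measures reported above, the following sketch shows one common way to compute them. It assumes the area under the curve is obtained from pairwise ranking of cases and non-cases, and the calibration slope from a logistic regression of the outcome on the logit of the predicted probability; the abstract does not state how the study computed either quantity. The outcomes and probabilities are made-up toy values, not study data.

```python
from math import exp, log


def auroc(y_true, y_prob):
    """Probability that a random case is ranked above a random non-case.

    Ties between a case and a non-case count as half a win.
    """
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0
               for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def calibration_slope(y_true, y_prob, iterations=25):
    """Slope b of the logistic model y ~ a + b * logit(p), via Newton's method.

    A slope near 1 indicates good calibration; a slope below 1 suggests
    predictions that are too extreme, and above 1 too moderate.
    Probabilities must lie strictly between 0 and 1.
    """
    x = [log(p / (1 - p)) for p in y_prob]
    a, b = 0.0, 1.0
    for _ in range(iterations):
        mu = [1 / (1 + exp(-(a + b * xi))) for xi in x]
        # Gradient of the log-likelihood with respect to (a, b)
        g0 = sum(y - m for y, m in zip(y_true, mu))
        g1 = sum((y - m) * xi for y, m, xi in zip(y_true, mu, x))
        # Entries of the 2x2 information matrix X'WX
        w = [m * (1 - m) for m in mu]
        h00 = sum(w)
        h01 = sum(wi * xi for wi, xi in zip(w, x))
        h11 = sum(wi * xi * xi for wi, xi in zip(w, x))
        det = h00 * h11 - h01 * h01
        # Newton update: (a, b) += inverse(X'WX) @ gradient
        a += (h11 * g0 - h01 * g1) / det
        b += (h00 * g1 - h01 * g0) / det
    return b


# Toy data for illustration only
y_true = [0, 0, 1, 0, 1, 0, 1, 1]
y_prob = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]

print("AUROC:", auroc(y_true, y_prob))
print("Calibration slope:", round(calibration_slope(y_true, y_prob), 3))
```

The pairwise loop is quadratic in the number of patients, so a rank-based formulation would be preferable for a cohort of the size analyzed here.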
CONCLUSIONS
A data-driven system was developed to predict delirium, suggesting that XAI techniques could be used to develop a learning health system. However, machine learning predictive models have not yet been implemented in electronic systems such as electronic medical records. To implement predictive models in the future, their reliability and robustness must first be evaluated.
CLINICALTRIAL
None, as this is an observational study.