BACKGROUND
Fall-related injuries (FRIs) in hospitalized patients are characterized by a high incidence rate, significant harm, and substantial medical burden. Machine learning and deep learning have become increasingly popular for developing models for tasks such as FRI risk prediction. However, recent studies have typically used structured data for predictive modeling, often neglecting potentially valuable unstructured clinical notes within electronic medical record systems (EMRSs) and adverse event reporting systems (AERSs). Integrating heterogeneous data (e.g., structured and textual data) may improve prediction model performance.
OBJECTIVE
To assess FRI risk using clinical big data from longitudinal EMRSs and AERSs, this study developed multimodal risk prediction models for varying degrees of FRIs by leveraging an automated machine learning (AutoML) tool, the AutoGluon-Tabular framework, enhanced by adapted multimodal approaches. Additionally, the contribution of risk factors, including those found in textual data, to predicting FRIs in hospitalized patients was assessed.
METHODS
FRI data were retrospectively collected from three Chinese tertiary-level general hospitals over a period of more than 10 years. One hospital was selected as the derivation cohort, and the other two hospitals served as independent validation cohorts. These hospitals utilized EMRSs, hospital information systems, and AERSs. The derivation cohort involved 33 departments reporting 1,724 FRIs, and the validation cohorts involved 32 departments reporting 340 FRIs. AutoGluon-Tabular was first applied to the structured data alone. Subsequently, multimodal approaches were integrated to incorporate both structured and unstructured information.
RESULTS
The accuracy, precision, recall, F1 score, and area under the receiver-operating characteristic curve (AUROC) of the model developed using AutoGluon-Tabular on structured data were 0.676±0.024, 0.663±0.066, 0.676±0.024, 0.622±0.026, and 0.7±0.049, respectively. Combining AutoGluon-Tabular with multimodal techniques to incorporate textual data enhanced model performance. Moreover, external validation in two Chinese tertiary-level hospitals yielded model accuracy, precision, recall, and AUROC of 0.679, 0.642, 0.679, and 0.664, respectively, which were similar to the derivation cohort results. The F1 score (0.654) was greater than that of the derivation cohort (0.611±0.057), indicating promising reproducibility and generalizability. In the feature importance rankings, current medical history, past medical history, personal history, and chief complaint status emerged as highly important, underscoring the value of textual data in predicting the risk of FRIs. Furthermore, factors such as male sex, gallbladder stones, multiple test indices, Gushukang capsule use, primary and secondary surgeries, smoking history, depression, ability to perform activities of daily living, pain, and unassisted falls were associated with FRI risk in the hospital setting.
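In both cohorts, the reported recall equals the reported accuracy (0.676 and 0.679). This is what support-weighted averaging across outcome classes produces, although the abstract does not state which averaging scheme was used, so that reading is an assumption. The following minimal sketch illustrates the relationship in plain Python; the function name `weighted_metrics` and the severity labels are hypothetical and are not study data.

```python
from collections import Counter


def weighted_metrics(y_true, y_pred):
    """Return accuracy plus support-weighted precision, recall, and F1."""
    n = len(y_true)
    support = Counter(y_true)    # number of true cases per class
    predicted = Counter(y_pred)  # number of predictions per class
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)

    accuracy = sum(correct.values()) / n
    precision = recall = f1 = 0.0
    for label, count in support.items():
        weight = count / n  # class share of all true cases
        p = correct[label] / predicted[label] if predicted[label] else 0.0
        r = correct[label] / count
        f = 2 * p * r / (p + r) if (p + r) else 0.0
        precision += weight * p
        recall += weight * r
        f1 += weight * f
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical labels (0 = no injury, 1 = minor, 2 = major); illustrative only.
y_true = [0, 0, 0, 0, 1, 1, 1, 2, 2, 2]
y_pred = [0, 0, 0, 1, 1, 1, 0, 2, 2, 1]
metrics = weighted_metrics(y_true, y_pred)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Under support weighting, each class's recall is multiplied by its share of true cases, so the weighted sum reduces to the fraction of correct predictions, which is the accuracy. Precision and F1 do not simplify in this way and can therefore differ from accuracy, as they do in the reported results.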
CONCLUSIONS
The combination of AutoGluon-Tabular with multimodal approaches to analyze textual data may improve FRI prediction model performance for hospitalized patients. The model was externally validated and showed promising predictive performance, suggesting its potential to help reduce FRI risk in hospitalized patients.