Enhancing Patient Outcome Prediction through Deep Learning with Sequential Diagnosis Codes from structural EHR: A systematic review (Preprint)

Published: 2024-02-19

Author:

Hama Tuankasfee, Alsaleh Mohanad, Allery Freya, Choi Jung Won, Tomlinson Chris, Wu Honghan, Lai Alvina, Pontikos Nikolas, Thygesen Johan H.

Abstract

BACKGROUND

Structured Electronic Health Records (EHRs) have been adopted rapidly across healthcare systems, which now collect large volumes of diagnosis codes recording the temporal sequence of each patient's clinical events. In the era of artificial intelligence, many models, especially Deep Learning (DL) models, have been applied to patient outcome prediction. This systematic review aimed to identify DL models developed to predict patient outcomes from sequential diagnosis codes.

OBJECTIVE

The main objective of this systematic review is to identify and summarise existing DL studies that predict patient outcomes using sequences of diagnosis codes as a key part of their predictors. The review also investigates the challenges of generalisability and explainability in these predictive models.

METHODS

We identified relevant studies by searching four databases: PubMed, Embase, IEEE Xplore, and Web of Science. We then evaluated the included papers on several aspects: deep learning techniques, dataset characteristics, prediction tasks, performance evaluation, generalisability, and explainability. We also assessed risk of bias and applicability concerns using PROBAST.

RESULTS

A total of 84 papers met the eligibility criteria and were included, indicating a growing trend in this research area. Recurrent neural networks (RNNs) and their derivatives (n = 47; 57.3%) and Transformers (n = 22; 26.8%) were the most popular architectures among DL-based models. Most studies represented their input features as a sequence of visit embeddings (n = 45; 53.6%). The most common prediction task was next-visit diagnosis (n = 30; 23.4%), followed by heart failure (n = 18; 14.1%) and mortality (n = 17; 13.3%). Only 7 studies evaluated the generalisability of their models. A positive correlation was observed between training sample size and model performance measured by AUROC (p < 0.05). However, about 70% of the included studies were found to have a high risk of bias.
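
As a rough illustration of the "sequence of visit embeddings" input format mentioned above, the sketch below turns one patient's visits into a list of fixed-length vectors, one per visit, by summing per-code vectors. This is not taken from any reviewed study: the patient record, the embedding dimension, and the helper names (`build_code_embeddings`, `embed_visit`, `embed_patient`) are all assumptions made for illustration, and the random vectors stand in for embeddings that a real model would learn during training.

```python
import random

def build_code_embeddings(vocab, dim, seed=0):
    """Give each diagnosis code a fixed random vector.

    Assumption: random vectors stand in for learned embeddings.
    """
    rng = random.Random(seed)
    return {code: [rng.uniform(-1.0, 1.0) for _ in range(dim)] for code in vocab}

def embed_visit(codes, code_embeddings, dim):
    """Represent one visit as the element-wise sum of its code vectors."""
    visit_vec = [0.0] * dim
    for code in codes:
        for i, value in enumerate(code_embeddings[code]):
            visit_vec[i] += value
    return visit_vec

def embed_patient(visits, code_embeddings, dim):
    """Map a chronologically ordered list of visits to a list of visit vectors."""
    return [embed_visit(codes, code_embeddings, dim) for codes in visits]

# Hypothetical patient: three visits in time order, each a set of ICD-10 codes.
patient_visits = [
    ["I10", "E11.9"],
    ["I10"],
    ["I50.9", "E11.9", "N18.3"],
]

DIM = 4  # illustrative embedding size
vocab = sorted({code for visit in patient_visits for code in visit})
code_embeddings = build_code_embeddings(vocab, DIM)
visit_sequence = embed_patient(patient_visits, code_embeddings, DIM)

# One vector per visit; a sequence model (e.g. an RNN or Transformer)
# would consume these vectors in order.
print(len(visit_sequence), len(visit_sequence[0]))  # prints "3 4"
```

Summing code vectors is only one possible way to pool the codes within a visit; the choice here is for simplicity and does not imply that the reviewed studies pooled codes this way.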

CONCLUSIONS

Applying deep learning to sequences of diagnosis codes has shown considerable promise for predicting patient outcomes. Using multiple feature types and integrating time intervals were found to improve predictive performance. Addressing the challenges of generalisability and explainability will be instrumental in unlocking the full potential of DL for improving healthcare outcomes and patient care.

CLINICALTRIAL

This review was registered on PROSPERO (CRD42023434032).

Publisher

JMIR Publications Inc.
