Improving Performance of Outcome Prediction for In-patients with Acute Myocardial Infarction Based on Embedding Representation Learned From Electronic Medical Records: Development and Validation Study (Preprint)

Published: 2022-02-22

Author:

Huang Yanqun, Zheng Zhimin, Ma Moxuan, Xin Xin, Liu Honglei, Fei Xiaolu, Wei Lan, Chen Hui

Abstract

BACKGROUND

The widespread secondary utilization of electronic medical records (EMRs) promotes healthcare quality improvement. Representation learning that can automatically extract hidden information from EMR data has gained increasing attention.

OBJECTIVE

We aimed to develop a patient representation that captures more feature associations and task-specific feature importance, in order to improve outcome prediction for in-patients with acute myocardial infarction (AMI).

METHODS

Medical concepts, including patients’ age, gender, diagnoses, laboratory tests, structured radiological features, procedures, and medications, were first embedded into real-valued vectors using an improved skip-gram algorithm, in which the concepts in each context window were selected by feature association strength, measured as the confidence of association rules. Each patient was then represented as the sum of the feature embeddings weighted by task-specific feature importance, which was used to support model prediction from both global and local perspectives. Finally, we applied the proposed patient representation to mortality risk prediction for 3010 and 1671 AMI in-patients from a public dataset and a private dataset, respectively, and compared it with several reference representation methods in terms of the area under the receiver operating characteristic curve (AUC).
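
The two ideas in this section, choosing context concepts by association-rule confidence and building a patient vector as an importance-weighted sum of concept embeddings, can be illustrated with a minimal sketch. It is not the authors' implementation: the toy records, the concept names, the embedding values, the importance weights, and the function names `rule_confidence`, `select_context`, and `patient_vector` are all hypothetical. In the study the embeddings are learned with the improved skip-gram algorithm, whereas here they are fixed by hand.

```python
from collections import Counter
from itertools import permutations


def rule_confidence(records):
    """Return confidence(a -> b) = count(a and b) / count(a) over all records."""
    single = Counter()
    pair = Counter()
    for rec in records:
        concepts = set(rec)
        single.update(concepts)
        # Count every ordered pair (a, b) of distinct concepts in the record.
        pair.update(permutations(sorted(concepts), 2))
    return {(a, b): n / single[a] for (a, b), n in pair.items()}


def select_context(target, record, conf, k=2):
    """Pick the k concepts in the record most strongly associated with target."""
    others = [c for c in set(record) if c != target]
    # Highest confidence first; ties are broken alphabetically.
    others.sort(key=lambda c: (-conf.get((target, c), 0.0), c))
    return others[:k]


def patient_vector(record, embeddings, importance):
    """Sum the concept embeddings, each scaled by its feature importance."""
    dim = len(next(iter(embeddings.values())))
    vec = [0.0] * dim
    for c in set(record):
        w = importance.get(c, 0.0)  # concepts without a weight contribute nothing
        for i, x in enumerate(embeddings[c]):
            vec[i] += w * x
    return vec


# Hypothetical toy data: each record lists the medical concepts of one patient.
records = [
    ["age_60+", "troponin_high", "aspirin"],
    ["age_60+", "troponin_high", "pci"],
    ["troponin_high", "aspirin"],
    ["age_60+", "aspirin"],
]
# Hypothetical 2-dimensional embeddings and importance weights (not learned here).
embeddings = {
    "age_60+": [1.0, 0.0],
    "troponin_high": [0.0, 1.0],
    "aspirin": [1.0, 1.0],
    "pci": [0.5, 0.5],
}
importance = {"age_60+": 0.5, "troponin_high": 2.0, "aspirin": 0.25}

conf = rule_confidence(records)
context = select_context("pci", records[1], conf, k=1)
vector = patient_vector(records[0], embeddings, importance)
```

In the approach described above, the selected context concepts would replace the fixed-width window of ordinary skip-gram training, and the importance weights would come from the prediction task rather than being set manually.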

RESULTS

Compared with the reference methods, the proposed embedding-based representation showed consistently superior predictive performance on both datasets, achieving mean AUCs of 0.861 and 0.980, whereas the highest AUCs among the reference methods were 0.852 and 0.942 on the public and private datasets, respectively. The feature importance integrated into the patient representation also highlighted features that were consistently critical in both the prediction tasks and clinical practice.

CONCLUSIONS

The introduction of feature associations and feature importance facilitated an effective patient representation and contributed to prediction performance improvement and model interpretation.

Publisher

JMIR Publications Inc.
