Sequential Data–Based Patient Similarity Framework for Patient Outcome Prediction: Algorithm Development

Published: 2022-01-06 Issue: 1 Volume: 24 Page: e30720
ISSN: 1438-8871
Container-title: Journal of Medical Internet Research
Language: en
Short-container-title: J Med Internet Res

Author:

Wang Ni, Wang Muyu, Zhou Yang, Liu Honglei, Wei Lan, Fei Xiaolu, Chen Hui

Abstract

Background: Sequential information in electronic medical records is valuable for patient outcome prediction but is rarely used for patient similarity measurement because of its unevenness, irregularity, and heterogeneity.

Objective: We aimed to develop a patient similarity framework for patient outcome prediction that makes use of sequential and cross-sectional information in electronic medical record systems.

Methods: Sequence similarity was calculated from timestamped event sequences using edit distance, and trend similarity was calculated from time series using dynamic time warping and Haar decomposition. We also extracted cross-sectional information, namely, demographic, laboratory test, and radiological report data, for additional similarity calculations. We validated the effectiveness of the framework by constructing k–nearest neighbors classifiers to predict mortality and readmission for acute myocardial infarction patients, using data from (1) a public data set and (2) a private data set, at 3 time points (at admission, on Day 7, and at discharge) to provide early warning of patient outcomes. We also constructed state-of-the-art Euclidean-distance k–nearest neighbor, logistic regression, random forest, long short-term memory network, and recurrent neural network models, which were used for comparison.

Results: With all available information during a hospitalization episode, predictive models using the similarity model outperformed baseline models on both the public and private data sets. For mortality predictions, all models except for the logistic regression model showed improved performance over time. There was no such increasing trend in predictive performance for readmission predictions. The random forest and logistic regression models performed best for mortality and readmission predictions, respectively, when using information from the first week after admission.

Conclusions: For patient outcome predictions, the patient similarity framework facilitated sequential similarity calculations for uneven electronic medical record data and helped improve predictive performance.
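
The similarity measures named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the record layout (`events`, `series`, `label`), the unweighted sum of the two distances, the unnormalized single-level Haar step, and all function names are assumptions chosen for illustration only. The abstract does not specify how the individual similarities are weighted or combined.

```python
from collections import Counter


def edit_distance(a, b):
    """Levenshtein distance between two event sequences (e.g. lists of event codes)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def dtw_distance(s, t):
    """Dynamic time warping distance with absolute-difference local cost."""
    inf = float("inf")
    n, m = len(s), len(t)
    d = [[inf] * (m + 1) for _ in range(n + 1)]
    d[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(s[i - 1] - t[j - 1])
            d[i][j] = cost + min(d[i - 1][j], d[i][j - 1], d[i - 1][j - 1])
    return d[n][m]


def haar_step(series):
    """One level of an unnormalized Haar transform: pairwise averages and differences.

    A trailing unpaired value (odd-length input) is ignored in this sketch.
    """
    pairs = list(zip(series[0::2], series[1::2]))
    averages = [(a + b) / 2 for a, b in pairs]
    details = [(a - b) / 2 for a, b in pairs]
    return averages, details


def knn_predict(query, patients, k=3):
    """Majority vote among the k nearest patients.

    Distance is an unweighted sum of edit distance and DTW distance,
    which is an assumption made for this sketch.
    """
    def dist(p):
        return (edit_distance(query["events"], p["events"])
                + dtw_distance(query["series"], p["series"]))

    nearest = sorted(patients, key=dist)[:k]
    return Counter(p["label"] for p in nearest).most_common(1)[0][0]
```

In this sketch, `edit_distance` handles event sequences of different lengths and `dtw_distance` handles time series sampled at different rates, which is why such measures suit uneven and irregular records; `haar_step` shows how a series can be reduced to coarser trend (averages) and local change (details) components before comparison.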

Publisher

JMIR Publications Inc.

Subject

Health Informatics

References: 39 articles.

1. Electronic Health Record Driven Prediction for Gestational Diabetes Mellitus in Early Pregnancy

2. High-Risk Breast Lesions: A Machine Learning Model to Predict Pathologic Upgrade and Reduce Unnecessary Surgical Excision

3. Deep Patient Similarity Learning for Personalized Healthcare

4. Combining structured and unstructured data for predictive models: a deep learning approach

5. Multitask learning and benchmarking with clinical time series data

Cited by 6 articles.

2. The predictive value of machine learning for mortality risk in patients with acute coronary syndromes: a systematic review and meta-analysis; European Journal of Medical Research; 2023-10-20

3. Representation of time-varying and time-invariant EMR data and its application in modeling outcome prediction for heart failure patients; Journal of Biomedical Informatics; 2023-07

4. Predicting outcomes at the individual patient level: what is the best method?; BMJ Mental Health; 2023-06

5. Improving the Performance of Outcome Prediction for Inpatients With Acute Myocardial Infarction Based on Embedding Representation Learned From Electronic Medical Records: Development and Validation Study; Journal of Medical Internet Research; 2022-08-03