Author:
Heilbroner Samuel P., Carter Curtis, Vidmar David M., Mueller Erik T., Stumpe Martin C., Miotto Riccardo
Abstract
Laboratory data in electronic health records (EHRs) is an effective source of information to characterize patient populations, inform accurate diagnostics and treatment decisions, and fuel research studies. However, despite their value, laboratory values are underutilized due to high levels of missingness. Existing imputation methods fall short, as they do not fully leverage patient clinical histories and are commonly not scalable to the large number of tests available in real-world data (RWD). To address these shortcomings, we present Laboratory Imputation Framework using EHRs (LIFE), a deep learning framework based on multi-head attention that is trained to impute any laboratory test value at any point in time in the patient’s journey using their complete EHRs. This architecture (1) eliminates the need to train a different model for each laboratory test by jointly modeling all laboratory data of interest; and (2) better clinically contextualizes the predictions by leveraging additional EHR variables, such as diagnoses, medications, and discrete laboratory results. We validate our framework using a large-scale, real-world dataset encompassing over 1 million oncology patients. Our results demonstrate that LIFE obtains superior or equivalent results compared to state-of-the-art baselines in 23 out of 25 evaluated laboratory tests and better enhances a downstream adverse event detection task in 7 out of 9 cases, showcasing its potential in efficiently estimating missing laboratory values and, consequently, in transforming the utilization of RWD in healthcare.
Publisher
Cold Spring Harbor Laboratory