Abstract
Electronic health records offer great promise for early disease detection, treatment evaluation, information discovery, and other important facets of precision health. Clinical notes, in particular, may contain nuanced information about a patient’s condition, treatment plans, and history that structured data may not capture. As a result, and with advancements in natural language processing, clinical notes have been increasingly used in supervised prediction models. To predict long-term outcomes such as chronic disease and mortality, it is often advantageous to leverage data occurring at multiple time points in a patient’s history. However, these data are often collected at irregular time intervals and varying frequencies, thus posing an analytical challenge. Here, we propose the use of large language models (LLMs) for robust temporal harmonization of clinical notes across multiple visits. We compare multiple state-of-the-art LLMs in their ability to generate useful information during time gaps, and evaluate performance in supervised deep learning models for clinical prediction.
Publisher
Cold Spring Harbor Laboratory