A Novel Sentence Transformer-based Natural Language Processing Approach for Schema Mapping of Electronic Health Records to the OMOP Common Data Model-Reference-Cited by-同舟云学术

A Novel Sentence Transformer-based Natural Language Processing Approach for Schema Mapping of Electronic Health Records to the OMOP Common Data Model

Published:2024-03-24 Issue: Volume: Page:
ISSN:
Container-title:
language:
Short-container-title:

Author:

Zhou Xinyu^ORCID,Dhingra Lovedeep Singh^ORCID,Aminorroaya Arya^ORCID,Adejumo Philip,Khera Rohan^ORCID

Abstract

AbstractMapping electronic health records (EHR) data to common data models (CDMs) enables the standardization of clinical records, enhancing interoperability and enabling large-scale, multi-centered clinical investigations. Using 2 large publicly available datasets, we developed transformer-based natural language processing models to map medication-related concepts from the EHR at a large and diverse healthcare system to standard concepts in OMOP CDM. We validated the model outputs against standard concepts manually mapped by clinicians. Our best model reached out-of-box accuracies of 96.5% in mapping the 200 most common drugs and 83.0% in mapping 200 random drugs in the EHR. For these tasks, this model outperformed a state-of-the-art large language model (SFR-Embedding-Mistral, 89.5% and 66.5% in accuracy for the two tasks), a widely-used software for schema mapping (Usagi, 90.0% and 70.0% in accuracy), and direct string match (7.5% and 7.5% accuracy). Transformer-based deep learning models outperform existing approaches in the standardized mapping of EHR elements and can facilitate an end-to-end automated EHR transformation pipeline.

Publisher

Cold Spring Harbor Laboratory

Reference18 articles.

1. Standardizing registry data to the OMOP Common Data Model: experience from three pulmonary hypertension databases;BMC medical research methodology,2021

2. Facilitating phenotype transfer using a common data model;Journal of biomedical informatics,2019

3. Deep-learning-based automated terminology mapping in OMOP-CDM;Journal of the American Medical Informatics Association,2021

4. Xiao G , Pfaff E , Prud’hommeaux E , Booth D , Sharma DK , Huo N , et al. FHIR-Ontop-OMOP: Building clinical knowledge graphs in FHIR RDF with the OMOP Common data Model. Journal of Biomedical Informatics. 2022;134:104201.

5. Feasibility and utility of applications of the common data model to multiple, disparate observational health databases