The OMOP common data model in Australian primary care data: Building a quality research ready harmonised dataset

Published:2024-04-18 Issue:4 Volume:19 Page:e0301557
ISSN:1932-6203
Container-title:PLOS ONE
language:en
Short-container-title:PLoS ONE

Author:

Ward Roger, Hallinan Christine Mary, Ormiston-Smith David, Chidgey Christine, Boyle Dougie

Abstract

Background: The use of routinely collected health data for secondary research purposes is increasingly recognised as a methodology that advances medical research, improves patient outcomes, and guides policy. This secondary data, as found in electronic medical records (EMRs), can be optimised through conversion into a uniform data structure to enable analysis alongside other comparable health metric datasets. This can be achieved with the Observational Medical Outcomes Partnership Common Data Model (OMOP-CDM), which employs a standardised vocabulary to facilitate systematic analysis across various observational databases. The concept behind the OMOP-CDM is the conversion of data into a common format through the harmonisation of terminologies, vocabularies, and coding schemes within a unique repository. The OMOP model enhances research capacity through the development of shared analytic and prediction techniques; pharmacovigilance for the active surveillance of drug safety; and ‘validation’ analyses at multiple institutions across Australia, the United States, Europe, and the Asia Pacific. In this research, we aim to investigate the use of the open-source OMOP-CDM in the PATRON primary care data repository.

Methods: We used standard structured query language (SQL) to construct extract, transform, and load (ETL) scripts that convert the data to the OMOP-CDM. Mapping the distinct free-text terms extracted from various EMRs presented a substantial challenge, as many terms could not be automatically matched to standard vocabularies through direct text comparison. This left a number of terms that required manual assignment. To address this issue, we instructed our clinical mappers to focus only on terms that appeared with sufficient frequency. We established a specific threshold value for each domain, ensuring that more than 95% of all records were linked to an approved vocabulary such as SNOMED once appropriate mapping was completed. To assess the data quality of the resultant OMOP dataset, we used the OHDSI Data Quality Dashboard (DQD) to evaluate the plausibility, conformity, and comprehensiveness of the data in the PATRON repository according to the Kahn framework.

Results: Across three primary care EMR systems, we converted data on 2.03 million active patients to version 5.4 of the OMOP common data model. The DQD assessment involved a total of 3,570 individual evaluations. Each evaluation compared the outcome against a predefined threshold. A ‘FAIL’ occurred when the percentage of non-compliant rows exceeded the specified threshold value. In this assessment of the primary care OMOP database, we achieved an overall pass rate of 97%.

Conclusion: The OMOP-CDM’s widespread international use, support, and training provide a well-established pathway for data standardisation in collaborative research. Its compatibility allows the sharing of analysis packages across local and international research groups, which facilitates rapid and reproducible data comparisons. A suite of open-source tools, including the OHDSI Data Quality Dashboard (Version 1.4.1), supports the model. Its simplicity and standards-based approach facilitate adoption and integration into existing data processes.
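The two threshold rules described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the toy term counts, and the 5% check threshold are hypothetical, and the only details taken from the abstract are that more than 95% of records should be covered by mapped terms and that a check fails when the percentage of non-compliant rows exceeds its threshold.

```python
from collections import Counter


def frequency_threshold(term_counts, target=0.95):
    """Find the lowest term frequency that still needs manual mapping.

    Terms are taken from most to least frequent until the mapped terms
    cover more than `target` of all records. Returns a tuple of
    (threshold, terms_to_map, coverage).
    """
    total = sum(term_counts.values())
    if total == 0:
        return 0, [], 0.0
    covered = 0
    threshold = 0
    for _, count in term_counts.most_common():
        if covered / total > target:
            break
        covered += count
        threshold = count
    # Every term at or above the threshold is mapped, including ties.
    to_map = sorted(t for t, c in term_counts.items() if c >= threshold)
    coverage = sum(term_counts[t] for t in to_map) / total
    return threshold, to_map, coverage


def check_status(violated_rows, total_rows, threshold_pct):
    """Return 'FAIL' when the percentage of violating rows exceeds the threshold."""
    pct = 100.0 * violated_rows / total_rows if total_rows else 0.0
    return "FAIL" if pct > threshold_pct else "PASS"


# Hypothetical free-text terms and record counts for one domain.
example_counts = Counter({
    "hypertension": 600,
    "asthma": 300,
    "t2dm": 60,
    "htn nos": 25,
    "asthmma": 10,
    "xx": 5,
})
threshold, to_map, coverage = frequency_threshold(example_counts)
```

In this toy example, mapping only the three most frequent terms is enough to exceed the 95% target, so the three rare terms would be left unmapped. A check with exactly 5% violating rows against a 5% threshold passes, because the rule is strictly "exceeds".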

Publisher

Public Library of Science (PLoS)

References (22 articles)

1. Mining electronic health records: towards better research applications and clinical care;PB Jensen;Nature Reviews Genetics,2012

2. Adding value to the electronic health record through secondary use of data for quality assurance, research, and surveillance;WR Hersh;Clin Pharmacol Ther,2007

3. Validation of a common data model for active safety surveillance research;JM Overhage;Journal of the American Medical Informatics Association,2012

4. Standardizing registry data to the OMOP Common Data Model: experience from three pulmonary hypertension databases;P Biedermann;BMC Medical Research Methodology,2021

5. Common Problems, Common Data Model Solutions: Evidence Generation for Health Technology Assessment;S Kent;PharmacoEconomics,2021