Harmonizing units and values of quantitative data elements in a very large nationally pooled electronic health record (EHR) dataset

Author:

Bradwell Katie R (1), Wooldridge Jacob T (2), Amor Benjamin (1), Bennett Tellen D (3), Anand Adit (2), Bremer Carolyn (2), Yoo Yun Jae (2), Qian Zhenglong (2), Johnson Steven G (4), Pfaff Emily R (5), Girvin Andrew T (1), Manna Amin (1), Niehaus Emily A (1), Hong Stephanie S (6), Zhang Xiaohan Tanner (7), Zhu Richard L (7), Bissell Mark (1), Qureshi Nabeel (1), Saltz Joel (2), Haendel Melissa A (8), Chute Christopher G (9), Lehmann Harold P (7), Moffitt Richard A (2)

Affiliation:

1. Palantir Technologies, Denver, Colorado, USA

2. Department of Biomedical Informatics, Stony Brook University, Stony Brook, New York, USA

3. Section of Informatics and Data Science, Department of Pediatrics, University of Colorado School of Medicine, University of Colorado, Aurora, Colorado, USA

4. Institute for Health Informatics, University of Minnesota, Minneapolis, Minnesota, USA

5. Department of Medicine, North Carolina Translational and Clinical Sciences Institute, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA

6. School of Medicine, Section of Biomedical Informatics and Data Science, Johns Hopkins University School of Medicine, Baltimore, Maryland, USA

7. Department of Medicine, Johns Hopkins, Baltimore, Maryland, USA

8. Center for Health AI, University of Colorado, Aurora, Colorado, USA

9. Schools of Medicine, Public Health, and Nursing, Johns Hopkins University, Baltimore, Maryland, USA

Abstract

Objective: The goals of this study were to harmonize data from electronic health records (EHRs) into common units and to impute units that were missing.

Materials and Methods: The source data were the National COVID Cohort Collaborative (N3C) table of laboratory measurement data, comprising over 3.1 billion patient records and over 19 000 unique measurement concepts in the Observational Medical Outcomes Partnership (OMOP) common-data-model format from 55 data partners. We grouped ontologically similar OMOP concepts together for 52 variables relevant to COVID-19 research, and developed a unit-harmonization pipeline comprising (1) selecting a canonical unit for each measurement variable, (2) arriving at a formula for conversion, (3) obtaining clinical review of each formula, (4) applying the formula to convert data values in each unit into the target canonical unit, and (5) removing any harmonized value that fell outside the accepted value range for the variable. Where a data partner supplied no units for any result of a lab test, we compared that partner's values with the pooled values of all data partners, using the Kolmogorov-Smirnov test.

Results: Of the concepts without missing values, we harmonized 88.1% of the values, and we imputed units for 78.2% of records where units were absent (41% of contributors' records lacked units).

Discussion: The harmonization and inference methods developed herein can serve as a resource for initiatives aiming to extract insight from heterogeneous EHR collections. Unique properties of centralized data are harnessed to enable unit inference.

Conclusion: The pipeline we developed for the pooled N3C data enables use of measurements that would otherwise be unavailable for analysis.
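To make the conversion, range-filtering, and unit-inference steps described in the abstract concrete, the following is a minimal Python sketch and not the N3C pipeline itself. The variable name `body_weight`, the conversion factors, the accepted range, and the function names `harmonize`, `ks_statistic`, and `infer_unit` are all hypothetical and chosen for illustration. The abstract says only that unitless values were compared with pooled values using the Kolmogorov-Smirnov test; picking the candidate unit with the smallest KS statistic is an assumption of this sketch.

```python
from bisect import bisect_right

# Hypothetical conversion factors: (variable, source unit) -> multiplier into
# the canonical unit (kg here). Illustrative only; these are not the paper's
# clinically reviewed formulas.
TO_CANONICAL = {
    ("body_weight", "kg"): 1.0,
    ("body_weight", "g"): 0.001,
    ("body_weight", "lb"): 0.45359237,
}

# Hypothetical accepted value range per variable, in the canonical unit.
ACCEPTED_RANGE = {"body_weight": (0.1, 500.0)}


def harmonize(variable, value, unit):
    """Convert a value to the canonical unit (step 4) and drop it if it falls
    outside the accepted range (step 5). Returns None when the unit is
    unknown or the converted value is out of range."""
    factor = TO_CANONICAL.get((variable, unit))
    if factor is None:
        return None
    converted = value * factor
    low, high = ACCEPTED_RANGE[variable]
    return converted if low <= converted <= high else None


def ks_statistic(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov statistic: the largest absolute gap
    between the two empirical cumulative distribution functions."""
    a, b = sorted(sample_a), sorted(sample_b)
    return max(
        abs(bisect_right(a, x) / len(a) - bisect_right(b, x) / len(b))
        for x in a + b
    )


def infer_unit(variable, unitless_values, pooled_canonical_values):
    """Try each candidate unit for the variable and keep the one whose
    converted values lie closest (smallest KS statistic) to the pooled,
    already harmonized values. The selection rule is an assumption."""
    best_unit, best_stat = None, float("inf")
    for (var, unit), factor in TO_CANONICAL.items():
        if var != variable:
            continue
        converted = [v * factor for v in unitless_values]
        stat = ks_statistic(converted, pooled_canonical_values)
        if stat < best_stat:
            best_unit, best_stat = unit, stat
    return best_unit, best_stat
```

In this sketch, a partner whose unitless body weights cluster around 140 to 190 would be assigned `lb` when the pooled values are in kilograms, because only the pound conversion brings its distribution into overlap with the pooled one. A production pipeline would also need a threshold on the statistic so that poorly matching data are left without an inferred unit.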

Funder

National Institutes of Health

Publisher

Oxford University Press (OUP)

Subject

Health Informatics

