Assessing the Effect of Electronic Health Record Data Quality on Identifying Patients with Type 2 Diabetes (Preprint)

Author:

Sood Priyanka Dua, Liu Star, Lehmann Harold, Kharrazi Hadi

Abstract

BACKGROUND

Growing and substantial reliance on electronic health records (EHRs) and their data types (i.e., diagnosis (Dx), medication (Rx), and laboratory (Lx) data) makes assessment of EHR data quality (DQ) fundamental. This is especially true because of the need to identify appropriate denominator populations with chronic conditions, such as type 2 diabetes (T2D), using commonly available computable phenotype definitions (phenotypes).

OBJECTIVE

To address this need, our study aims to assess how EHR DQ issues, together with variations in phenotypes and their robustness (or lack thereof), may affect the identification of denominator populations.

METHODS

Approximately 208k patients with T2D were included in our study using retrospective EHR data from Johns Hopkins Medical Institution (JHMI) for 2017-2019. Our assessment included 4 published phenotypes and 1 definition from a panel of experts at Hopkins. We conducted descriptive analyses of demographics (i.e., age, sex, race, ethnicity), healthcare utilization (inpatient and emergency room visits), and the average Charlson Comorbidity score for each phenotype. We then used different methods to induce (simulate) DQ issues of completeness, accuracy, and timeliness separately for each phenotype. For induced data incompleteness, our model randomly dropped Dx, Rx, and Lx codes independently in increments of 10%. For induced data inaccuracy, our model randomly replaced a Dx or Rx code with another code of the same data type and changed Lx result values in 2% increments from -10% to +10%. Lastly, for timeliness, our model shifted record dates in 30-day increments, up to one year.
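The three induced DQ issues described above can be illustrated with a minimal sketch. This is not the study's code: the record layout (dictionaries with `code`, `value`, and `date` keys), the function names, and the direction of the date shift (later in time) are all assumptions made for illustration, and the abstract does not state what fraction of Dx or Rx codes was replaced at each inaccuracy step.

```python
import random
from datetime import date, timedelta

# Hypothetical record layout (not from the study): each record is a dict
# with a "code" key (Dx/Rx), a "value" key (Lx), and/or a "date" key.

def drop_codes(records, fraction, rng):
    """Induced incompleteness: randomly drop the given fraction of records."""
    n_keep = len(records) - round(len(records) * fraction)
    kept = sorted(rng.sample(range(len(records)), n_keep))
    return [records[i] for i in kept]

def swap_codes(records, fraction, code_pool, rng):
    """Induced inaccuracy (Dx or Rx): replace a fraction of codes with a
    different code drawn from code_pool, a list of same-type codes."""
    n_swap = round(len(records) * fraction)
    chosen = set(rng.sample(range(len(records)), n_swap))
    out = []
    for i, rec in enumerate(records):
        if i in chosen:
            alternatives = [c for c in code_pool if c != rec["code"]]
            rec = {**rec, "code": rng.choice(alternatives)}
        out.append(rec)
    return out

def scale_lab_values(records, pct_change):
    """Induced inaccuracy (Lx): change every result value by pct_change percent."""
    return [{**r, "value": r["value"] * (1 + pct_change / 100)} for r in records]

def shift_dates(records, days):
    """Induced timeliness issue: move every record date later by `days`
    (the direction of the shift is an assumption)."""
    return [{**r, "date": r["date"] + timedelta(days=days)} for r in records]

# Sweeps matching the increments stated in the Methods.
INCOMPLETENESS_STEPS = [i / 10 for i in range(11)]   # 0%, 10%, ..., 100%
LAB_CHANGE_STEPS = list(range(-10, 11, 2))           # -10%, -8%, ..., +10%
DATE_SHIFT_STEPS = list(range(30, 361, 30))          # 30, 60, ..., 360 days
```

In this sketch, each degraded copy of the data would be passed back through a phenotype definition and the identified patients counted again. Passing a seeded `random.Random` instance as `rng` keeps a run reproducible.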

RESULTS

Less than a quarter (23%) of the population overlapped across all phenotypes when using EHR data. The population identified by each phenotype varied across all combinations of data types. Induced incompleteness identified fewer patients with each increment. For example, at 100% diagnostic incompleteness, the Chronic Conditions Data Warehouse (CCW) phenotype identified zero patients because its phenotypic characteristics included only Dx codes. Induced inaccuracy and timeliness issues similarly produced variations in the performance of each phenotype, resulting in fewer patients being identified with each incremental change.
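One way to compute an overlap figure of this kind is sketched below. The abstract does not state which denominator was used, so taking the union of all phenotype cohorts as the denominator is an assumption, as are the function name and the toy patient IDs.

```python
def overlap_fraction(cohorts):
    """Share of patients identified by any phenotype who are identified
    by every phenotype. `cohorts` maps phenotype name -> patient IDs.
    Using the union as the denominator is an assumption."""
    sets = [set(ids) for ids in cohorts.values()]
    union = set().union(*sets)
    if not union:
        return 0.0
    common = set.intersection(*sets)
    return len(common) / len(union)

# Toy cohorts with hypothetical phenotype names and patient IDs.
example = {
    "phenotype_a": {1, 2, 3},
    "phenotype_b": {2, 3, 4},
    "phenotype_c": {3, 5},
}
share = overlap_fraction(example)  # 1 shared patient out of 5 in the union
```

The same function can be applied to cohorts recomputed after each induced DQ step to see how the shared population changes.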

CONCLUSIONS

We used EHR data with Dx, Rx, and Lx data types from a large tertiary hospital system to understand differences among T2D phenotypes and their performance. Using induced DQ methods, we learned how DQ issues may affect identification of the denominator populations on which clinical decisions (e.g., clinical research and trials, population health evaluations) and financial or operational decisions are based. The novel results of our study may inform the development of a common T2D computable phenotype definition applicable to clinical informatics, chronic condition management, and other healthcare industry-wide efforts.

CLINICALTRIAL

Not applicable

Publisher

JMIR Publications Inc.
