Combining billing codes, clinical notes, and medications from electronic health records provides superior phenotyping performance

Author:

Wei Wei-Qi (1), Teixeira Pedro L (1), Mo Huan (1), Cronin Robert M (1, 2), Warner Jeremy L (1, 2), Denny Joshua C (1, 2)

Affiliation:

1. Department of Biomedical Informatics, Vanderbilt University, Nashville, TN, USA

2. Department of Medicine, Vanderbilt University, Nashville, TN, USA

Abstract

Objective: To evaluate the phenotyping performance of three major electronic health record (EHR) components: International Classification of Diseases (ICD) diagnosis codes, primary notes, and specific medications.

Materials and Methods: We conducted the evaluation using de-identified Vanderbilt EHR data. We preselected ten diseases: atrial fibrillation, Alzheimer’s disease, breast cancer, gout, human immunodeficiency virus infection, multiple sclerosis, Parkinson’s disease, rheumatoid arthritis, and types 1 and 2 diabetes mellitus. For each disease, patients were classified into seven categories based on the presence of evidence in diagnosis codes, primary notes, and specific medications. Twenty-five patients per disease category (175 patients for each disease, 1750 patients across all ten diseases) were randomly selected for manual chart review. Review results were used to estimate the positive predictive value (PPV), sensitivity, and F-score for each EHR component alone and in combination.

Results: The PPVs of single components were inconsistent and inadequate for accurate phenotyping (0.06–0.71). Using two or more ICD codes improved the average PPV to 0.84. We observed higher and more stable accuracy when using at least two components (mean ± standard deviation: 0.91 ± 0.08). Primary notes offered the best sensitivity (0.77). The sensitivity of ICD codes was 0.67. Again, two or more components provided a reasonably high and stable sensitivity (0.59 ± 0.16). Overall, the best performance (F-score: 0.70 ± 0.12) was achieved by using two or more components. Although the overall performance of ICD codes alone (0.67 ± 0.14) was only slightly lower than that of two or more components, the PPV of ICD codes (0.71 ± 0.13) was substantially worse than that of two or more components (0.91 ± 0.08).

Conclusion: Multiple EHR components provide more consistent and higher performance than a single component for the selected phenotypes. We suggest considering multiple EHR components in future phenotyping designs to obtain the best results.
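As an illustration only, the Python sketch below shows one way the seven evidence categories and the F-score mentioned in the abstract could be computed. It assumes the seven categories are the non-empty combinations of the three components and that the F-score is the harmonic mean of PPV and sensitivity; neither assumption is spelled out in the abstract. The names `COMPONENTS`, `evidence_categories`, and `f_score` are hypothetical and do not come from the study.

```python
from itertools import combinations

# Hypothetical labels for the three EHR components named in the abstract.
COMPONENTS = ("icd_codes", "primary_notes", "medications")


def evidence_categories(components=COMPONENTS):
    """Return every non-empty subset of the components.

    Assumption: the abstract's "seven categories" are these subsets
    (2**3 - 1 = 7); the abstract does not define them explicitly.
    """
    return [
        frozenset(subset)
        for size in range(1, len(components) + 1)
        for subset in combinations(components, size)
    ]


def f_score(ppv, sensitivity):
    """Harmonic mean of PPV and sensitivity (assumed F-score definition)."""
    if ppv + sensitivity == 0:
        return 0.0
    return 2 * ppv * sensitivity / (ppv + sensitivity)


categories = evidence_categories()

# Categories in which at least two components carry evidence for the disease.
two_or_more = [c for c in categories if len(c) >= 2]

# 25 charts per category gives the per-disease review count in the abstract.
charts_per_disease = 25 * len(categories)
```

Feeding the abstract's mean PPV and mean sensitivity into `f_score` will not reproduce the reported mean F-scores exactly, because an average of per-disease F-scores generally differs from the F-score of averaged inputs.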

Publisher

Oxford University Press (OUP)

Subject

Health Informatics

