Determining prescriptions in electronic healthcare record (EHR) data: methods for development of standardized, reproducible drug codelists

Author:

Graul Emily L, Stone Philip W, Massen Georgie M, Hatam Sara, Adamson Alexander, Denaxas Spiros, Peters Nicholas S, Quint Jennifer K

Abstract

ABSTRACT

Objective: Epidemiological research using electronic healthcare records (EHR), which informs everyday patient care, uses combinations of codes (“codelists”) to define diseases and prescriptions (or phenotypes). Yet methodology for codelist generation varies, manifesting in misclassification bias, and drug codelists raise considerations of their own.

Materials and Methods: We developed methods to generate drug codelists and tested them using the Clinical Practice Research Datalink (CPRD) Aurum database, accounting for missing data in “attribute” search variables. We generated codelists for 1) cardiovascular disease therapies and 2) inhaled Chronic Obstructive Pulmonary Disease (COPD) therapies, applying them to a sample cohort of 335,931 COPD patients. We compared A) searching on all search variables (the “gold standard”) to searching on B) chemical information only and C) ontological information only.

Results: In Search A we determined 165,150 patients prescribed cardiovascular drugs (49.2% of cohort) and 317,963 prescribed COPD inhalers (94.7% of cohort). Considering output per value set, Search C missed substantial prescriptions, including vasodilator anti-hypertensives (A and B: 19,696 prescriptions; C: 1,145) and SAMA inhalers (A and B: 35,310; C: 564).

Discussion: We recommend the full methods (A) for comprehensiveness. There are special considerations when generating adaptable and generalizable drug codelists, including fluctuating status, cohort-specific drug indications, underlying hierarchical ontology, and statistical analyses.

Conclusions: Methods must have end-to-end clinical input, and be standardizable, reproducible, and understandable to all researchers across data contexts.

LAY ABSTRACT

Health research using patient records informs everyday medicine, using groups of codes (“codelists”) to define diseases and drugs. Yet methods to create drug codelists are inconsistent, and may neither include physician expertise nor be reported.

We developed a reproducible method to create drug codelists, testing it using de-identified healthcare records. We generated codelists for 1) heart conditions and 2) inhalers to identify prescriptions in a sample group of 335,931 patients with chronic lung disease. We compared our full methods (Search A) to two restricted searches to show that prescriptions can be missed if necessary considerations are not made.

In Search A, we determined 165,150 people (49.2% of sample group) prescribed drugs from the heart codelist. For lung inhalers, we determined 317,963 people prescribed them (94.7% of sample group). Search C missed substantial prescriptions for drugs lowering blood pressure by opening vessels (A and B: 19,696 prescriptions; C: 1,145) and for short-term inhalers opening airways (A and B: 35,310; C: 564).

We recommend the full methods (A) for completeness. Drug codelist methods must be consistent, reproducible, and include physician input at all research stages, and they carry special considerations including drug status (eg, new, taken off market), disease, and drug categorical system. Quality methods should be freely accessible and usable across study contexts.
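The comparison of Searches A, B, and C can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the toy product dictionary, the field names (`term`, `substance`, `ontology`), the search terms, and the exact mapping of each search to fields are hypothetical and are not the CPRD Aurum schema or the authors' implementation. It only shows how missing attribute data can cause a restricted search to return fewer product codes than a search over all variables.

```python
# Toy product dictionary. Field names and rows are hypothetical,
# not taken from CPRD Aurum. None marks a missing attribute.
PRODUCTS = [
    {"code": "p1", "term": "Ipratropium 20micrograms/dose inhaler",
     "substance": "Ipratropium bromide",
     "ontology": "Antimuscarinic bronchodilators"},
    {"code": "p2", "term": "Atrovent 20micrograms/dose inhaler",
     "substance": None, "ontology": None},
    {"code": "p3", "term": "Ipratropium 250micrograms/1ml nebuliser liquid",
     "substance": "Ipratropium bromide", "ontology": None},
    {"code": "p4", "term": "Amlodipine 5mg tablets",
     "substance": "Amlodipine", "ontology": "Calcium-channel blockers"},
]

# Hypothetical search terms for one value set (a SAMA-like example).
CHEMICAL_TERMS = ["ipratropium"]
BRAND_TERMS = ["atrovent"]
ONTOLOGY_TERMS = ["antimuscarinic bronchodilators"]


def search(products, terms, fields):
    """Return codes of products where any term matches any non-missing field."""
    needles = [t.lower() for t in terms]
    hits = set()
    for row in products:
        # Skip missing attributes instead of failing on them.
        values = [row[f].lower() for f in fields if row.get(f) is not None]
        if any(n in v for n in needles for v in values):
            hits.add(row["code"])
    return hits


# Search A: all terms across all search variables.
search_a = search(PRODUCTS, CHEMICAL_TERMS + BRAND_TERMS + ONTOLOGY_TERMS,
                  ["term", "substance", "ontology"])
# Search B: chemical terms only (assumed here to run over name fields).
search_b = search(PRODUCTS, CHEMICAL_TERMS, ["term", "substance"])
# Search C: ontological information only.
search_c = search(PRODUCTS, ONTOLOGY_TERMS, ["ontology"])

for name, hits in [("A", search_a), ("B", search_b), ("C", search_c)]:
    missed = sorted(search_a - hits)
    print(f"Search {name}: {sorted(hits)} (missed vs A: {missed})")
```

In this toy data, the ontology-only search returns a single product because the ontology attribute is missing for the other matching rows, which mirrors the kind of shortfall reported for Search C without reproducing its actual numbers.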

Publisher

Cold Spring Harbor Laboratory

