Abstract
AbstractElectronic health record (EHR) systems with prescription data offer vast potential in pharmacoepidemiology and pharmacogenomics. The large amount of clinical data recorded in these systems requires automatic processing to extract relevant information. This paper introduces PRESNER, a name entity recognition (NER) and classification pipeline for EHR prescription data.The pipeline uses the pre-trained transformer Bio-ClinicalBERT fine-tuned on UK Biobank prescription entries manually annotated with medication-related information (drug name, route of administration, pharmaceutical form, strength, and dosage) as the core NER system. Moreover, PRESNER also maps drugs to the Anatomical Therapeutic and Chemical (ATC) classification system and distinguishes between systemic and non-systemic drug products. It outperformed a baseline model combining the state-of-the-art Med7 and a dictionary-based approach from the ChEMBL database with a macro-average F1-score of 0.95 vs 0.71. In addition to UK Biobank prescription data, PRESNER can also be applied to other English prescription datasets, making it a versatile tool for researchers in the field.
Publisher
Cold Spring Harbor Laboratory
Cited by
1 articles.
订阅此论文施引文献
订阅此论文施引文献,注册后可以免费订阅5篇论文的施引文献,订阅后可以查看论文全部施引文献