Development of a Pipeline for Adverse Drug Reaction Identification in Clinical Notes: Word Embedding Models and String Matching

Published:2022-01-25 Issue:1 Volume:10 Page:e31063
ISSN:2291-9694
Container-title:JMIR Medical Informatics
language:en
Short-container-title:JMIR Med Inform

Author:

Siegersma Klaske R, Evers Maxime, Bots Sophie H, Groepenhoff Floor, Appelman Yolande, Hofstra Leonard, Tulevski Igor I, Somsen G Aernout, den Ruijter Hester M, Spruit Marco, Onland-Moret N Charlotte

Abstract

Background: Knowledge about adverse drug reactions (ADRs) in the population is limited because of underreporting, which hampers surveillance and assessment of drug safety. Therefore, gathering accurate information that can be retrieved from clinical notes about the incidence of ADRs is of great relevance. However, manual labeling of these notes is time-consuming, and automatization can improve the use of free-text clinical notes for the identification of ADRs. Furthermore, tools for language processing in languages other than English are not widely available.

Objective: The aim of this study is to design and evaluate a method for automatic extraction of medication and Adverse Drug Reaction Identification in Clinical Notes (ADRIN).

Methods: Dutch free-text clinical notes (N=277,398) and medication registrations (N=499,435) from the Cardiology Centers of the Netherlands database were used. All clinical notes were used to develop word embedding models. Vector representations of word embedding models and string matching with a medical dictionary (Medical Dictionary for Regulatory Activities [MedDRA]) were used for identification of ADRs and medication in a test set of clinical notes that were manually labeled. Several settings, including search area and punctuation, could be adjusted in the prototype to evaluate the optimal version of the prototype.

Results: The ADRIN method was evaluated using a test set of 988 clinical notes written on the stop date of a drug. Multiple versions of the prototype were evaluated for a variety of tasks. Binary classification of ADR presence achieved the highest accuracy of 0.84. Reduced search area and inclusion of punctuation improved performance, whereas incorporation of the MedDRA did not improve the performance of the pipeline.

Conclusions: The ADRIN method and prototype are effective in recognizing ADRs in Dutch clinical notes from cardiac diagnostic screening centers. Surprisingly, incorporation of the MedDRA did not result in improved identification on top of word embedding models. The implementation of the ADRIN tool may help increase the identification of ADRs, resulting in better care and saving substantial health care costs.
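The abstract does not describe the implementation, so the following is only a minimal sketch of how dictionary string matching with an adjustable search area and punctuation setting might look. It is not the ADRIN code. The term lists `DRUGS` and `ADR_TERMS`, the function names, and the window size are all hypothetical. The example uses English words for readability, whereas the study worked with Dutch notes and MedDRA. The word embedding step is not shown.

```python
import re

# Hypothetical toy vocabularies (assumptions for illustration only);
# the study used medication registrations and the MedDRA dictionary.
DRUGS = {"metoprolol", "simvastatin"}
ADR_TERMS = {"dizziness", "myalgia", "cough"}
SENTENCE_END = {".", "!", "?", ";"}


def tokenize(text, keep_punctuation=True):
    """Lowercase the text and split it into word tokens.

    When keep_punctuation is True, sentence-ending marks are kept as
    separate tokens so that a search can stop at a sentence boundary.
    """
    pattern = r"\w+|[.!?;]" if keep_punctuation else r"\w+"
    return re.findall(pattern, text.lower())


def find_adrs(text, window=5, keep_punctuation=True):
    """Return sorted (drug, adr_term) pairs found near each other.

    For every drug mention, up to `window` tokens are scanned to the
    right and to the left. A sentence-ending token, if present, stops
    the scan in that direction.
    """
    tokens = tokenize(text, keep_punctuation)
    pairs = set()
    for i, tok in enumerate(tokens):
        if tok not in DRUGS:
            continue
        for step in (1, -1):
            j, scanned = i + step, 0
            while 0 <= j < len(tokens) and scanned < window:
                if tokens[j] in SENTENCE_END:
                    break
                if tokens[j] in ADR_TERMS:
                    pairs.add((tok, tokens[j]))
                j += step
                scanned += 1
    return sorted(pairs)


def has_adr(text, **settings):
    """Binary classification: does the note mention any drug-ADR pair?"""
    return bool(find_adrs(text, **settings))


def accuracy(predictions, labels):
    """Fraction of predictions that equal the manual labels."""
    correct = sum(p == l for p, l in zip(predictions, labels))
    return correct / len(labels)


if __name__ == "__main__":
    note = ("Stopped metoprolol because of dizziness. "
            "Patient reports cough since last week.")
    print(find_adrs(note, window=10, keep_punctuation=True))
    print(find_adrs(note, window=10, keep_punctuation=False))
```

In this toy note, keeping punctuation confines the search to the sentence that contains the drug mention, while dropping it lets the window reach the unrelated "cough" in the next sentence. That is one plausible reading of why a smaller search area and retained punctuation could help, though the abstract itself does not give the mechanism.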

Publisher

JMIR Publications Inc.

Subject

Health Information Management,Health Informatics

Cited by 3 articles.

1. Extracting adverse drug events from clinical Notes: A systematic review of approaches used;Journal of Biomedical Informatics;2024-03

2. Classification of Severe Maternal Morbidity from Electronic Health Records Written in Spanish Using Natural Language Processing;Applied Sciences;2023-09-27

3. Use of the Electronic Health Record for Monitoring Adverse Drug Reactions;Current Allergy and Asthma Reports;2023-05-16