Multi-label classification of symptom terms from free-text bilingual adverse drug reaction reports using natural language processing

Published:2022-08-04 Issue:8 Volume:17 Page:e0270595
ISSN:1932-6203
Container-title:PLOS ONE
language:en
Short-container-title:PLoS ONE

Author:

Chaichulee Sitthichok, Promchai Chissanupong, Kaewkomon Tanyamai, Kongkamol Chanon, Ingviya Thammasin, Sangsupawanich Pasuree

Abstract

Allergic reactions to medication range from mild to severe or even life-threatening. Proper documentation of patient allergy information is critical for safe prescription, avoiding drug interactions, and reducing healthcare costs. Allergy information is regularly obtained during the medical interview, but is often poorly documented in electronic health records (EHRs). While many EHRs allow for structured adverse drug reaction (ADR) reporting, free-text entry is still common. The resulting information is neither interoperable nor easily reusable for other applications, such as clinical decision support systems and prescription alerts. Current approaches require pharmacists to review and code ADRs documented by healthcare professionals. Recently, the effectiveness of machine learning algorithms in natural language processing (NLP) has been widely demonstrated. Our study aims to develop and evaluate different NLP algorithms that can encode unstructured ADRs stored in EHRs into institutional symptom terms. Our dataset consists of 79,712 pharmacist-reviewed drug allergy records. We evaluated three NLP techniques: Naive Bayes-Support Vector Machine (NB-SVM), Universal Language Model Fine-tuning (ULMFiT), and Bidirectional Encoder Representations from Transformers (BERT). We tested different general-domain pre-trained BERT models, including mBERT, XLM-RoBERTa, and WanchanBERTa, as well as our domain-specific AllergyRoBERTa, which was pre-trained from scratch on our corpus. Overall, BERT models had the highest performance. NB-SVM outperformed ULMFiT and BERT for several symptom terms that are not frequently coded. The ensemble model achieved an exact match ratio of 95.33%, an F1 score of 98.88%, and a mean average precision of 97.07% for the 36 most frequently coded symptom terms. The model was then further developed into a symptom term suggestion system and achieved a Krippendorff’s alpha agreement coefficient of 0.7081 in prospective testing with pharmacists.
Some degree of automation could both accelerate the availability of allergy information and reduce the efforts for human coding.
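
For readers unfamiliar with the multi-label metrics named in the abstract, the following is a minimal sketch of how an exact match ratio and an F1 score can be computed when each record may carry several symptom terms. It is not the authors' evaluation code. The symptom names and toy predictions are hypothetical, and micro-averaging of F1 is an assumption, since the abstract does not state which averaging was used.

```python
from typing import List, Set


def exact_match_ratio(y_true: List[Set[str]], y_pred: List[Set[str]]) -> float:
    """Fraction of records whose predicted label set equals the true label set."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        return 0.0
    matches = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return matches / len(y_true)


def micro_f1(y_true: List[Set[str]], y_pred: List[Set[str]]) -> float:
    """Micro-averaged F1: pool true/false positives and false negatives over all records."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    tp = sum(len(t & p) for t, p in zip(y_true, y_pred))  # labels correctly predicted
    fp = sum(len(p - t) for t, p in zip(y_true, y_pred))  # labels predicted but not true
    fn = sum(len(t - p) for t, p in zip(y_true, y_pred))  # true labels that were missed
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Hypothetical toy data: each set holds the symptom terms coded for one record.
true_labels = [{"rash"}, {"rash", "angioedema"}, {"nausea"}, set()]
pred_labels = [{"rash"}, {"rash"}, {"nausea", "rash"}, set()]

print(f"exact match ratio: {exact_match_ratio(true_labels, pred_labels):.2f}")
print(f"micro F1: {micro_f1(true_labels, pred_labels):.2f}")
```

The exact match ratio is the stricter of the two: a record counts only if every symptom term is predicted correctly and no extra term is added, whereas F1 gives partial credit for each correctly predicted term.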

Publisher

Public Library of Science (PLoS)

Subject

Multidisciplinary

References (32 articles)

1. Drug Allergy;DA Khan;Journal of Allergy and Clinical Immunology,2010

2. Epidemiology and Risk Factors for Drug Allergy;BYH Thong;British Journal of Clinical Pharmacology,2011

3. Drug Allergy;R Warrington;Allergy, Asthma & Clinical Immunology,2011

4. Drug Allergy;PA Greenberger;Allergy and Asthma Proceedings,2019

5. Automated Identification of Drug and Food Allergies Entered Using Non-standard Terminology;RH Epstein;Journal of the American Medical Informatics Association,2013

Cited by 12 articles.

1. Construction of a Multi-Label Classifier for Extracting Multiple Incident Factors From Medication Incident Reports in Residential Care Facilities: Natural Language Processing Approach;JMIR Medical Informatics;2024-07-23

2. Extracting patient lifestyle characteristics from Dutch clinical text with BERT models;BMC Medical Informatics and Decision Making;2024-06-03

3. Application of the transformer model algorithm in chinese word sense disambiguation: a case study in chinese language;Scientific Reports;2024-03-15

4. Construction of a Multi-Label Classifier for Extracting Multiple Incident Factors From Medication Incident Reports in Residential Care Facilities: Natural Language Processing Approach (Preprint);2024-03-07

5. Optimizing classification of diseases through language model analysis of symptoms;Scientific Reports;2024-01-17