Explainable detection of adverse drug reaction with imbalanced data distribution

Published: 2022-06-15 Issue: 6 Volume: 18 Page: e1010144
ISSN: 1553-7358
Container-title: PLOS Computational Biology
Language: en
Short-container-title: PLoS Comput Biol

Author:

Wang Jin, Yu Liang-Chih, Zhang Xuejie

Abstract

Analysis of health-related texts can be used to detect adverse drug reactions (ADRs). The greatest challenge for ADR detection lies in imbalanced data distributions, where words related to ADR symptoms often belong to minority classes. As a result, trained models tend to converge to a point that is strongly biased towards the majority class and ignores the minority class. Since the most commonly used cross-entropy criterion is an approximation to accuracy, the model focuses more readily on the majority class to achieve high accuracy. To address this issue, existing methods apply either over-sampling or down-sampling strategies to balance the data distribution and exploit the most difficult samples of the minority class. However, increasing or reducing the number of individual tokens in sequence labeling tasks results in the loss of the syntactic relations of the sentence. This paper proposes a weighted variant of the conditional random field (CRF) for data-imbalanced sequence labeling tasks. Such a weighting strategy can alleviate the imbalance in data distribution between majority and minority classes. Instead of using softmax in the output layer, the model uses a CRF, which can capture the relationships between the labels of tokens. The local interpretable model-agnostic explanations (LIME) algorithm was applied to investigate performance differences between models with and without the weighted loss function. Experimental results on two different ADR tasks show that the proposed model outperforms previously proposed sequence labeling methods.
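
The effect of class weighting on an imbalanced tagging loss can be illustrated with a minimal sketch. The abstract does not give the paper's weighted CRF formulation, so the code below is not the authors' method: it assumes a simpler stand-in, a token-level negative log-likelihood with inverse-frequency label weights (the same N / (K * count) heuristic that scikit-learn calls "balanced"). The function names, the toy tag sequences, and the predicted probabilities are all hypothetical.

```python
import math
from collections import Counter


def inverse_frequency_weights(tag_sequences):
    """Weight each label by N / (K * count), so rare labels get larger weights.

    N is the total number of tokens and K the number of distinct labels.
    This is an illustrative heuristic, not the weighting used in the paper.
    """
    counts = Counter(tag for seq in tag_sequences for tag in seq)
    total = sum(counts.values())
    num_labels = len(counts)
    return {tag: total / (num_labels * n) for tag, n in counts.items()}


def weighted_nll(token_probs, gold_tags, weights):
    """Weighted mean negative log-likelihood over the tokens of one sentence.

    token_probs: one dict per token mapping label -> predicted probability.
    gold_tags:   the gold label of each token.
    weights:     dict mapping label -> loss weight.
    """
    loss = 0.0
    norm = 0.0
    for probs, tag in zip(token_probs, gold_tags):
        w = weights[tag]
        loss += -w * math.log(probs[tag])
        norm += w
    return loss / norm


# Hypothetical training tags: "O" dominates, ADR labels are rare.
train_tags = [
    ["O", "O", "O", "B-ADR", "I-ADR", "O"],
    ["O", "O", "O", "O", "O", "O"],
]
weights = inverse_frequency_weights(train_tags)

# Hypothetical model output that is confident on "O" but misses the ADR token.
gold = ["O", "B-ADR", "O"]
probs = [
    {"O": 0.90, "B-ADR": 0.05, "I-ADR": 0.05},
    {"O": 0.70, "B-ADR": 0.20, "I-ADR": 0.10},
    {"O": 0.90, "B-ADR": 0.05, "I-ADR": 0.05},
]

uniform = {tag: 1.0 for tag in weights}
unweighted_loss = weighted_nll(probs, gold, uniform)
weighted_loss = weighted_nll(probs, gold, weights)
```

In this toy setting the weighted loss is larger than the unweighted one, because the error on the rare B-ADR token is no longer averaged away by the well-predicted majority "O" tokens. A CRF-based model would score whole label sequences rather than independent tokens, which this sketch deliberately leaves out.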

Funder

National Natural Science Foundation of China

Ministry of Science and Technology, Taiwan

Publisher

Public Library of Science (PLoS)

Subject

Computational Theory and Mathematics; Cellular and Molecular Neuroscience; Genetics; Molecular Biology; Ecology; Modeling and Simulation; Ecology, Evolution, Behavior and Systematics

References: 51 articles.

1. Under-Reporting of Adverse Drug Reactions: A Systematic Review;L Hazell;Drug Safety,2006

2. Combing signals from spontaneous reports and electronic health records for detection of adverse drug reactions;R Harpaz;Journal of the American Medical Informatics Association,2013

3. Pharmacovigilance using clinical notes;P LePendu;Clinical Pharmacology and Therapeutics,2013

4. Active Computerized Pharmacovigilance Using Natural Language Processing, Statistics, and Electronic Health Records: A Feasibility Study;X Wang;Journal of the American Medical Informatics Association,2009

5. Drug safety surveillance using de-identified EMR and claims data: Issues and challenges;PM Nadkarni;Journal of the American Medical Informatics Association,2010

Cited by 5 articles.

1. Utilizing nanotechnology and advanced machine learning for early detection of gastric cancer surgery;Environmental Research;2024-03

2. Multiple features-based adverse drug reaction detection from social media using deep convolutional neural networks (DCNN);Multimedia Tools and Applications;2024-01-27

3. A prefix and attention map discrimination fusion guided attention for biomedical named entity recognition;BMC Bioinformatics;2023-02-08

4. Explainable Artificial Intelligence for Patient Safety: A Review of Application in Pharmacovigilance;IEEE Access;2023

5. Hierarchical Classification of Adverse Events Based on Consumer’s Comments;Computational Science – ICCS 2023;2023