Relation Extraction in Biomedical Texts Based on Multi-Head Attention Model With Syntactic Dependency Feature: Modeling Study

Published:2022-10-20 Issue:10 Volume:10 Page:e41136
ISSN:2291-9694
Container-title:JMIR Medical Informatics
language:en
Short-container-title:JMIR Med Inform

Author:

Li Yongbin, Hui Linhu, Zou Liping, Li Huyang, Xu Luo, Wang Xiaohua, Chua Stephanie

Abstract

Background: With the rapid expansion of biomedical literature, biomedical information extraction has attracted increasing attention from researchers. In particular, relation extraction between 2 entities is a long-standing research topic.

Objective: This study aimed to perform 2 multiclass relation extraction tasks of the Biomedical Natural Language Processing Workshop 2019 Open Shared Tasks: the Bacteria-Biotope relation extraction (BB-rel) task and the binary relation extraction of plant seed development (SeeDev-binary) task. In essence, both tasks aim to extract the relation between annotated entity pairs from biomedical texts, which is a challenging problem.

Methods: Traditional approaches used feature- or kernel-based methods and achieved good performance. For these tasks, we propose a deep learning model based on a combination of several distributed features, such as domain-specific word embedding, part-of-speech embedding, entity-type embedding, distance embedding, and position embedding. The multi-head attention mechanism is used to extract the global semantic features of an entire sentence. Meanwhile, we introduced a dependency-type feature and the shortest dependency path connecting 2 candidate entities in the syntactic dependency graph to enrich the feature representation.

Results: Experiments show that our proposed model performs well in biomedical relation extraction, achieving F1 scores of 65.56% and 38.04% on the test sets of the BB-rel and SeeDev-binary tasks, respectively. In the SeeDev-binary task in particular, the F1 score of our model is superior to that of other existing models and achieves state-of-the-art performance.

Conclusions: We demonstrated that the multi-head attention mechanism can learn relevant syntactic and semantic features in different representation subspaces and at different positions to extract a comprehensive feature representation. Moreover, syntactic dependency features can improve the performance of the model by learning the dependency relations between the entities in biomedical texts.
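The shortest dependency path feature named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `shortest_dependency_path`, the edge format, the example sentence, and its hand-written dependency labels are all assumptions made for illustration. The sketch treats the dependency graph as undirected and uses breadth-first search to find the tokens and dependency types that connect 2 candidate entities.

```python
from collections import deque


def shortest_dependency_path(edges, source, target):
    """Return (token indices, dependency types) along the shortest path.

    edges: iterable of (head_index, dependent_index, dependency_type).
    The graph is treated as undirected, which is an assumption of this sketch.
    Returns None when the two tokens are not connected.
    """
    adjacency = {}
    for head, dep, dep_type in edges:
        adjacency.setdefault(head, []).append((dep, dep_type))
        adjacency.setdefault(dep, []).append((head, dep_type))

    # Breadth-first search, remembering how each token was first reached.
    previous = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            break
        for neighbor, dep_type in adjacency.get(node, []):
            if neighbor not in previous:
                previous[neighbor] = (node, dep_type)
                queue.append(neighbor)

    if target not in previous:
        return None

    # Walk back from the target to the source, then reverse.
    path, types = [target], []
    while previous[path[-1]] is not None:
        node, dep_type = previous[path[-1]]
        types.append(dep_type)
        path.append(node)
    path.reverse()
    types.reverse()
    return path, types


# Hypothetical sentence with a hand-written parse (not taken from the paper).
tokens = ["Lactobacillus", "was", "isolated", "from", "yogurt"]
edges = [
    (2, 0, "nsubj:pass"),  # isolated -> Lactobacillus
    (2, 1, "aux:pass"),    # isolated -> was
    (2, 4, "obl"),         # isolated -> yogurt
    (4, 3, "case"),        # yogurt -> from
]

# Candidate entities: token 0 (a bacterium) and token 4 (a habitat).
path, types = shortest_dependency_path(edges, 0, 4)
print([tokens[i] for i in path])
print(types)
```

In a model of the kind the abstract describes, the tokens and dependency types on such a path would typically be embedded and combined with the other features; how the authors do this is not stated in the abstract.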

Publisher

JMIR Publications Inc.

Subject

Health Information Management,Health Informatics

References: 58 articles.

1. Mining knowledge from text using information extraction

2. Text-mining approaches in molecular biology and biomedicine

3. Frontiers of biomedical text mining: current progress

4. Extracting drug-drug interactions from biomedical texts

Cited by 4 articles.

1. Automated information extraction model enhancing traditional Chinese medicine RCT evidence extraction (Evi-BERT): algorithm development and validation;Frontiers in Artificial Intelligence;2024-08-15

2. Elucidating the semantics-topology trade-off for knowledge inference-based pharmacological discovery;Journal of Biomedical Semantics;2024-05-01

3. Advancing Italian biomedical information extraction with transformers-based models: Methodological insights and multicenter practical application;Journal of Biomedical Informatics;2023-12

4. Associating biological context with protein-protein interactions through text mining at PubMed scale;Journal of Biomedical Informatics;2023-09