Subsequence and distant supervision based active learning for relation extraction of Chinese medical texts

Published:2023-02-14 Issue:1 Volume:23 Page:
ISSN:1472-6947
Container-title:BMC Medical Informatics and Decision Making
language:en
Short-container-title:BMC Med Inform Decis Mak

Author:

Ye Qi,Cai Tingting,Ji Xiang,Ruan Tong,Zheng Hong

Abstract

In recent years, relation extraction from unstructured text has become an important task in medical research. However, relation extraction requires a large labeled corpus, and manually annotating sequences is time-consuming and expensive. Efficient and economical annotation methods are therefore needed to ensure the performance of relation extraction. This paper proposes an active learning method based on subsequences and distant supervision. Instead of the full sentences used in traditional active learning, the method selects information-rich subsequences as the sampling unit for annotation. It also saves the labeled subsequence texts and their corresponding labels in a dictionary that is continuously updated and maintained, and pre-labels the unlabeled set through text matching, following the idea of distant supervision. Finally, the method is combined with a Chinese-RoBERTa-CRF model for relation extraction in Chinese medical texts. In experiments on the CMeIE dataset, the method achieves the best performance compared with existing methods, and the best F1 value obtained across the different sampling strategies is 55.96%.
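The dictionary-based pre-labeling step described in the abstract could look roughly like the sketch below. It is only an illustration under stated assumptions, not the authors' implementation: the function names `update_dictionary` and `pre_label`, the per-character tag format, the `"?"` placeholder for unlabeled positions, and the longest-match-first policy are all hypothetical choices made for this example.

```python
from typing import Dict, List

# Hypothetical label dictionary: annotated subsequence text -> one tag per character.
LabelDict = Dict[str, List[str]]


def update_dictionary(dictionary: LabelDict, subsequence: str, labels: List[str]) -> None:
    """Store a newly annotated subsequence together with its per-character tags."""
    if len(subsequence) != len(labels):
        raise ValueError("expected exactly one tag per character")
    dictionary[subsequence] = list(labels)


def pre_label(sentence: str, dictionary: LabelDict) -> List[str]:
    """Pre-label a sentence by exact text matching against the dictionary.

    Positions with no match keep the placeholder "?" (an assumption of this
    sketch), so they can still be selected for manual annotation later.
    Longer entries are tried first, and labeled positions are never overwritten.
    """
    labels = ["?"] * len(sentence)
    for text in sorted(dictionary, key=len, reverse=True):
        start = sentence.find(text)
        while start != -1:
            end = start + len(text)
            if all(tag == "?" for tag in labels[start:end]):
                labels[start:end] = dictionary[text]
            start = sentence.find(text, start + 1)
    return labels


# Usage: annotate one subsequence, then pre-label an unlabeled sentence.
# The tag names below are made up for illustration.
label_dict: LabelDict = {}
update_dictionary(label_dict, "阿司匹林", ["B-DRUG", "I-DRUG", "I-DRUG", "I-DRUG"])
pre_labeled = pre_label("服用阿司匹林后", label_dict)
```

In an active learning loop, a dictionary of this kind would be updated after each annotation round and reapplied to the remaining unlabeled set, so that repeated subsequences do not need to be annotated again.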

Funder

the National Key Research and Development Program of China

the Zhejiang Lab

Publisher

Springer Science and Business Media LLC

Subject

Health Informatics,Health Policy,Computer Science Applications

Link

https://link.springer.com/content/pdf/10.1186/s12911-023-02127-1.pdf

References (38 articles)

1. Song B, Li F, Liu Y, Zeng X. Deep learning methods for biomedical named entity recognition: a survey and qualitative comparison. Brief Bioinform. 2021;22(6):282.

2. Wang C, Fan J. Medical relation extraction with manifold models. In: Proceedings of the 52nd annual meeting of the association for computational linguistics, Long Papers; 2014. vol. 1, pp. 828–838.

3. Yang C, Xiao D, Luo Y, Li B, Zhao X, Zhang H. A hybrid method based on semi-supervised learning for relation extraction in Chinese EMRs. BMC Med Inform Decis Mak. 2022;22:169.

4. Zhao Y, Zhang A, Xie R, Liu K, Wang X. Connecting embeddings for knowledge graph entity typing. arXiv preprint arXiv:2007.10873. 2020.

5. Geng Z, Zhang Y, Han Y. Joint entity and relation extraction model based on rich semantics. Neurocomputing. 2021;429:132–40.

Cited by 1 article.

1. Entity relationship extraction from Chinese electronic medical records based on feature augmentation and cascade binary tagging framework;Mathematical Biosciences and Engineering;2023