Active Learning for Name Entity Recognition with External Knowledge

Authors:

Ma Ying1, Zhang Yu2, Sangaiah Arun Kumar3, Yan Ming4, Li Guoqi5, Wang Tian6

Affiliation:

1. Harbin Institute of Technology; Xiamen University of Technology, China

2. Xiamen University of Technology, China

3. National Yunlin University of Science and Technology, China

4. Institute of High Performance Computing, Agency for Science, Technology and Research, Singapore

5. Institute of Automation, Chinese Academy of Sciences, China

6. BNU-UIC Institute of Artificial Intelligence and Future Networks, Beijing Normal University; Guangdong Key Lab of AI and Multi-Modal Data Processing, BNU-HKBU United International College (UIC), China

Abstract

Named Entity Recognition (NER) is an important task in knowledge extraction that aims to extract structured information from unstructured text. To make full use of the prior knowledge in pre-trained language models, some research recasts the NER task in machine reading comprehension form (MRC-form) to improve how well the models generalize commonsense knowledge. However, this reformulation remains data-hungry when training data for a specific NER task are limited. To address the low-resource issue in NER, we introduce active multi-task-based NER (AMT-NER), a two-stage, multi-task active learning model. Specifically, a multi-task learning module is first introduced into AMT-NER to improve its representation capability in low-resource NER tasks. Then, a two-stage training strategy is proposed to optimize AMT-NER's multi-task learning. An associated Natural Language Inference (NLI) task is also employed to further enhance its commonsense knowledge. More importantly, AMT-NER introduces an active learning module based on uncertainty selection, which actively filters training data to help the NER model learn efficiently. We also find that different external supportive data, used under different pipelines, improve model performance to different degrees in NER tasks. Extensive experiments show the superiority of our method and support our finding that introducing external knowledge is significant and effective in MRC-form NER tasks.
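The uncertainty-based selection mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not state which uncertainty measure AMT-NER uses, so predictive entropy is an assumption here, and the names `predictive_entropy`, `select_most_uncertain`, and the example pool are hypothetical, not taken from the paper.

```python
import math
from typing import Dict, List, Sequence


def predictive_entropy(probs: Sequence[float]) -> float:
    """Shannon entropy (in nats) of one predicted label distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def select_most_uncertain(
    pool_probs: Dict[str, Sequence[float]], budget: int
) -> List[str]:
    """Return the ids of the `budget` unlabeled examples with the highest entropy.

    Assumption: entropy stands in for the paper's unspecified uncertainty score.
    """
    ranked = sorted(
        pool_probs,
        key=lambda example_id: predictive_entropy(pool_probs[example_id]),
        reverse=True,
    )
    return ranked[:budget]


# Hypothetical model outputs for three unlabeled sentences (illustrative only).
pool = {
    "sent-a": [0.98, 0.01, 0.01],  # confident prediction, low entropy
    "sent-b": [0.40, 0.35, 0.25],  # near-uniform, high entropy
    "sent-c": [0.70, 0.20, 0.10],  # in between
}
selected = select_most_uncertain(pool, budget=2)
print(selected)
```

In an active learning loop of this kind, the selected examples would be sent for annotation and added to the training set before the model is retrained; how AMT-NER schedules this across its two training stages is not described in the abstract.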

Publisher

Association for Computing Machinery (ACM)

Subject

General Computer Science

