Low-Resource Named Entity Recognition via the Pre-Training Model

Published: 2021-05-02, Issue: 5, Volume: 13, Page: 786
ISSN: 2073-8994
Container-title: Symmetry
Language: en

Author:

Chen Siqi, Pei Yijie, Ke Zunwang, Silamu Wushour

Abstract

Named entity recognition (NER) is an important task in natural language processing that requires determining entity boundaries and classifying the entities into pre-defined categories. Most state-of-the-art systems require tens of thousands of annotated sentences to achieve high performance, but very little annotated data is available for NER in low-resource languages such as Uyghur and Hungarian (UH languages). Each task also has its own specificities: differences in words and word order across languages make the problem challenging. In this paper, we present an effective way to provide a meaningful and easy-to-use feature extractor for NER tasks: fine-tuning a pre-trained language model. We propose a fine-tuning method for low-resource languages that first constructs a fine-tuning dataset through data augmentation, then adds a dataset from a high-resource language, and finally fine-tunes the cross-lingual pre-trained model on the combined dataset. In addition, we propose an attention-based fine-tuning strategy that uses symmetry to better select relevant semantic and syntactic information from pre-trained language models and applies these symmetry features to NER tasks. We evaluated our approach on Uyghur and Hungarian datasets, where it performed well compared with several strong baselines. We close with an overview of the available resources for named entity recognition and some of the open research questions.
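The abstract describes NER as determining entity boundaries and classifying them into categories. As a rough illustration of that task definition only, the sketch below converts per-token tags into entity spans. It assumes the common BIO tagging scheme (`B-` begins an entity, `I-` continues it, `O` is outside), which the abstract does not specify; the function name `bio_to_spans` and the example tokens are hypothetical and not taken from the paper.

```python
from typing import List, Tuple

# (start index, end index exclusive, entity type)
Span = Tuple[int, int, str]


def bio_to_spans(tags: List[str]) -> List[Span]:
    """Convert a BIO tag sequence into (start, end, type) entity spans.

    Assumption: tags follow the BIO scheme. A stray "I-X" with no open
    span of type X is treated leniently as the start of a new span.
    """
    spans: List[Span] = []
    start, label = None, None
    # A sentinel "O" at the end flushes any span still open.
    for i, tag in enumerate(tags + ["O"]):
        continues = tag.startswith("I-") and label == tag[2:]
        if start is not None and not continues:
            spans.append((start, i, label))  # close the open span
            start, label = None, None
        if tag.startswith("B-") or (tag.startswith("I-") and start is None):
            start, label = i, tag[2:]  # open a new span
    return spans


# Illustrative tokens and tags (made up for this example).
tokens = ["Anna", "Kovacs", "visited", "Budapest"]
tags = ["B-PER", "I-PER", "O", "B-LOC"]
print(bio_to_spans(tags))  # [(0, 2, 'PER'), (3, 4, 'LOC')]
```

Each span gives the boundary (token offsets) and the category of one entity, so `tokens[0:2]` is a person name and `tokens[3:4]` is a location in this example.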

Funder

National Key Research and Development Program of China

National Language Commission Research Project

Publisher

MDPI AG

Subject

Physics and Astronomy (miscellaneous), General Mathematics, Chemistry (miscellaneous), Computer Science (miscellaneous)

Link

https://www.mdpi.com/2073-8994/13/5/786/pdf

References (40 articles)

1. Improving Low Resource Named Entity Recognition using Cross-lingual Knowledge Transfer; Feng; IJCAI, 2018

2. Neural cross-lingual named entity recognition with minimal resources; Xie; arXiv, 2018

Cited by 16 articles.

1. KCB-FLAT: Enhancing Chinese Named Entity Recognition with Syntactic Information and Boundary Smoothing Techniques; Mathematics; 2024-08-30

2. Named Entity Recognition (NER) in Low Resource Languages of Ho; Advances in Computational Intelligence and Robotics; 2024-02-27

3. Named entity recognition and emotional viewpoint monitoring in online news using artificial intelligence; PeerJ Computer Science; 2024-01-10

4. An Approach to a Linked Corpus Creation for a Literary Heritage Based on the Extraction of Entities from Texts; Applied Sciences; 2024-01-09

5. Nested Named-Entity Recognition in Multilingual Code-Switched NLP; Lecture Notes in Electrical Engineering; 2023-11-30