Named entity recognition using transfer learning and small human‐ and meta‐pseudo‐labeled datasets

Published: 2024-02 Issue: 1 Volume: 46 Page: 59-70
ISSN: 1225-6463
Container-title: ETRI Journal
Language: en
Short-container-title: ETRI Journal

Author:

Bae Kyoungman¹, Lim Joon‐Ho¹

Affiliation:

1. Language Intelligence Research Section, Electronics and Telecommunications Research Institute, Daejeon, Republic of Korea

Abstract

We introduce a high‐performance named entity recognition (NER) model for written and spoken language. To overcome labeled‐data scarcity and domain shift, we apply transfer learning with our previously developed KorBERT as the base model. We also adopt a meta‐pseudo‐label method that uses a teacher/student framework with labeled and unlabeled data. Our model makes two modifications. First, the student model is updated with the average of the losses on human‐ and pseudo‐labeled data. Second, the influence of noisy pseudo‐labeled data is mitigated by considering feedback scores and updating the teacher model only when the score is below a threshold (0.0005). We achieve the target NER performance in the spoken language domain and improve performance in the written language domain by proposing a straightforward rollback method that reverts to the best model based on scarce human‐labeled data. Further improvement is achieved by adjusting the label vector weights in the named entity dictionary.
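
The update rules described in the abstract can be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation: the abstract does not say how the feedback score is computed, whether the comparison is strict, or which metric drives the rollback, so the function names, the strict `<` comparison, and the use of a held-out human‐labeled score are all assumptions made for this example. Only the averaging of the two losses and the threshold value 0.0005 come from the abstract.

```python
from dataclasses import dataclass
from typing import Optional

# Threshold value quoted in the abstract.
FEEDBACK_THRESHOLD = 0.0005


def student_loss(human_loss: float, pseudo_loss: float) -> float:
    """Average the loss on human-labeled data and on pseudo-labeled data."""
    return 0.5 * (human_loss + pseudo_loss)


def should_update_teacher(feedback_score: float,
                          threshold: float = FEEDBACK_THRESHOLD) -> bool:
    """Assumption: the teacher is updated only if the score is strictly below the threshold."""
    return feedback_score < threshold


@dataclass
class RollbackTracker:
    """Hypothetical helper: keeps the best state seen so far on human-labeled data."""
    best_score: float = float("-inf")
    best_state: Optional[dict] = None

    def step(self, state: dict, human_labeled_score: float) -> dict:
        # Assumption: a higher score (for example F1 on a small human-labeled set) is better.
        if self.best_state is None or human_labeled_score > self.best_score:
            self.best_score = human_labeled_score
            self.best_state = dict(state)
            return state
        # Otherwise revert to a copy of the best state recorded so far.
        return dict(self.best_state)


if __name__ == "__main__":
    tracker = RollbackTracker()
    print(student_loss(0.4, 0.8))
    print(should_update_teacher(0.0001), should_update_teacher(0.001))
    print(tracker.step({"w": 1}, 0.80))
    print(tracker.step({"w": 2}, 0.75))  # worse score, so the earlier state is returned
```

In a real training loop the `state` dictionary would stand in for model parameters, and the two losses would come from the student's forward passes on a human‐labeled batch and a teacher‐labeled batch.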

Funder

Institute for Information and Communications Technology Promotion

Publisher

Wiley

Link

https://onlinelibrary.wiley.com/doi/pdf/10.4218/etrij.2023-0321

Cited by 1 article.

1. Special issue on speech and language AI technologies;ETRI Journal;2024-02