A Joint Model to Identify and Align Bilingual Named Entities

Published: 2013-06 Issue: 2 Volume: 39 Page: 229-266
ISSN: 0891-2017
Container-title: Computational Linguistics
Language: en
Short-container-title: Computational Linguistics

Author:

Chen Yufeng¹, Zong Chengqing¹, Su Keh-Yih²

Affiliation:

1. National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences

2. Behavior Design Corporation

Abstract

In this article, an integrated model is derived that jointly identifies and aligns bilingual named entities (NEs) between Chinese and English. The model is motivated by the following observations: (1) whether an NE is translated semantically or phonetically depends greatly on its entity type, (2) entities within an aligned pair should share the same type, and (3) the initially detected NEs can act as anchors and provide further information while selecting NE candidates. Based on these observations, this article proposes a translation mode ratio feature (defined as the proportion of NE internal tokens that are semantically translated), enforces an entity type consistency constraint, and utilizes additional new NE likelihoods (based on the initially detected NE anchors). Experiments show that this novel method significantly outperforms the baseline. The type-insensitive F-score of identified NE pairs increases from 78.4% to 88.0% (12.2% relative improvement) in our Chinese–English NE alignment task, and the type-sensitive F-score increases from 68.4% to 83.0% (21.3% relative improvement). Furthermore, the proposed model demonstrates its robustness when it is tested across different domains. Finally, when semi-supervised learning is conducted to train the adopted English NE recognition model, the proposed model also significantly boosts the English NE recognition type-sensitive F-score.
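The translation mode ratio defined in the abstract, and the relative improvements it reports, can be illustrated with a minimal Python sketch. The function names and the per-token labels (`"semantic"`, `"phonetic"`) are hypothetical choices made for illustration; the abstract does not specify how tokens are labeled, and this is not the authors' implementation.

```python
from typing import Sequence


def translation_mode_ratio(modes: Sequence[str]) -> float:
    """Proportion of NE-internal tokens labeled as semantically translated.

    `modes` holds one label per token inside a named entity; the label
    vocabulary used here is an assumption, not taken from the article.
    """
    if not modes:
        raise ValueError("a named entity must contain at least one token")
    semantic = sum(1 for m in modes if m == "semantic")
    return semantic / len(modes)


def relative_improvement(before: float, after: float) -> float:
    """Relative improvement of `after` over `before`, in percent."""
    return (after - before) / before * 100.0


# Hypothetical three-token entity: one token transliterated, two translated.
example_modes = ["phonetic", "semantic", "semantic"]
print(translation_mode_ratio(example_modes))

# F-scores quoted in the abstract (type-insensitive, then type-sensitive).
print(round(relative_improvement(78.4, 88.0), 1))
print(round(relative_improvement(68.4, 83.0), 1))
```

The last two calls recompute the 12.2% and 21.3% relative improvements from the absolute F-scores given in the abstract.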

Publisher

MIT Press - Journals

Subject

Artificial Intelligence, Computer Science Applications, Linguistics and Language, Language and Linguistics

Link

https://www.mitpressjournals.org/doi/pdf/10.1162/COLI_a_00122

Cited by 17 articles.

1. Impact of Translation on Biomedical Information Extraction: Experiment on Real-Life Clinical Notes; JMIR Medical Informatics; 2024-04-04

2. Impact of translation on biomedical information extraction: evaluation on real-life clinical notes (Preprint); 2023-06-03

3. Impact of translation on biomedical information extraction from real-life clinical notes; 2023-03-30

4. Improving the Robustness of Loanword Identification in Social Media Texts; ACM Transactions on Asian and Low-Resource Language Information Processing; 2023-03-24

5. EntityRank: Unsupervised Mining of Bilingual Named Entity Pairs from Parallel Corpora for Neural Machine Translation; 2022 IEEE International Conference on Big Data (Big Data); 2022-12-17