Author:
Yu Chuanming, Xue Haodong, Wang Manyi, An Lu
Abstract
Purpose
Owing to the uneven distribution of annotated corpora across languages, it is necessary to bridge the gap between low-resource and high-resource languages. From the perspective of entity relation extraction, this paper aims to extend the knowledge acquisition task from a single-language context to a cross-lingual context, and to improve relation extraction performance for low-resource languages.
Design/methodology/approach
This paper proposes a cross-lingual adversarial relation extraction (CLARE) framework, which decomposes cross-lingual relation extraction into two stages: parallel corpus acquisition and adversarial adaptation relation extraction. Based on the proposed framework, this paper conducts extensive experiments on two tasks, namely English-to-Chinese and English-to-Arabic cross-lingual entity relation extraction.
Findings
The Macro-F1 values of the optimal models in the two tasks are 0.8801 and 0.7899, respectively, indicating that the proposed CLARE framework can significantly improve entity relation extraction performance for low-resource languages. The experimental results suggest that the proposed framework can effectively transfer the corpus as well as the annotated tags from English to Chinese and Arabic. This study reveals that the proposed approach is less labour-intensive and more effective for cross-lingual entity relation extraction than the manual method. It also shows that the approach generalizes well across different languages.
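For readers unfamiliar with the metric reported above, Macro-F1 is the unweighted mean of the per-class F1 scores, so rare relation types count as much as frequent ones. The following is a minimal sketch of that computation, not the authors' evaluation code. The function name `macro_f1` and the relation labels in the example are hypothetical, and the paper's exact protocol (for instance, whether a "no relation" class is included in the average) is not stated in this abstract.

```python
from collections import Counter


def macro_f1(gold, pred):
    """Unweighted mean of per-class F1 over every label seen in gold or pred.

    Illustrative only: this is the standard single-label definition,
    assumed here rather than taken from the paper.
    """
    labels = sorted(set(gold) | set(pred))
    if not labels:
        return 0.0

    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, pred):
        if g == p:
            tp[g] += 1          # correct prediction for class g
        else:
            fp[p] += 1          # p was predicted but is wrong
            fn[g] += 1          # g was the true class but was missed

    scores = []
    for label in labels:
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the denominator is 0
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical relation labels, for illustration only.
gold = ["born_in", "works_for", "born_in", "no_relation"]
pred = ["born_in", "born_in", "born_in", "no_relation"]
score = macro_f1(gold, pred)
```

In this toy example the per-class F1 scores are 0.8 (`born_in`), 0.0 (`works_for`) and 1.0 (`no_relation`), so the single missed `works_for` instance pulls the macro average down to 0.6 even though three of four predictions are correct.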
Originality/value
The results are significant for improving the performance of cross-lingual knowledge acquisition. Cross-lingual transfer may greatly reduce the time and cost of manually constructing multilingual corpora. The study sheds light on knowledge acquisition and organization from unstructured text in the era of big data.
Subject
Library and Information Sciences, Computer Science Applications