Affiliation:
1. College of Computer Science and Technology, Harbin Engineering University, China
2. State Key Laboratory for Novel Software Technology, Nanjing University, China
Abstract
Recently, the emergence of the digital language divide and the availability of cross-lingual benchmarks have made research on cross-lingual texts more popular. However, existing methods based on mapping relations do not perform well enough, because the structures of the language spaces are not always isomorphic. In addition, polysemy makes interaction features hard to extract. For cross-lingual word embedding, a model named Cross-lingual Word Embedding Space Based on Pseudo Corpus (CWE-PC) is proposed to obtain cross-lingual and multilingual word embeddings. For capturing interaction features of cross-lingual sentence pairs, a Cross-language Feature Capture Based on Similarity Matrix (CFC-SM) model is built. A pretrained ELMo model and multi-layer convolution are used to alleviate polysemy and extract interaction features. The models are evaluated on multiple language pairs, and the results show that they outperform state-of-the-art cross-lingual word embedding methods.
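To make the idea of a sentence-pair similarity matrix concrete, the following is a minimal sketch only, not the CFC-SM model described in the abstract. It assumes that the words of both sentences have already been embedded in a shared cross-lingual space and that cosine similarity is the pairwise measure; the abstract specifies neither. The function names and the toy two-dimensional vectors are hypothetical.

```python
import math
from typing import List

Vector = List[float]


def cosine(u: Vector, v: Vector) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def similarity_matrix(src: List[Vector], tgt: List[Vector]) -> List[List[float]]:
    """Entry [i][j] is the similarity of source word i and target word j."""
    return [[cosine(u, v) for v in tgt] for u in src]


# Toy embeddings (assumed, for illustration): a 2-word source sentence
# and a 3-word target sentence in a shared 2-dimensional space.
src_sentence = [[1.0, 0.0], [0.0, 1.0]]
tgt_sentence = [[1.0, 0.0], [1.0, 1.0], [0.0, 2.0]]

matrix = similarity_matrix(src_sentence, tgt_sentence)
for row in matrix:
    print([round(x, 3) for x in row])
```

A matrix of this shape (source length by target length) is the kind of input over which convolution layers could then extract interaction features, although the actual architecture is not given in the abstract.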