Alternating Language Modeling for Cross-Lingual Pre-Training

Published: 2020-04-03 Issue: 05 Volume: 34 Page: 9386-9393
ISSN: 2374-3468
Container-title: Proceedings of the AAAI Conference on Artificial Intelligence
Short-container-title: AAAI

Author:

Yang Jian, Ma Shuming, Zhang Dongdong, Wu Shuangzhi, Li Zhoujun, Zhou Ming

Abstract

Language model pre-training has achieved success in many natural language processing tasks. Existing methods for cross-lingual pre-training adopt the Translation Language Model, which predicts masked words from the concatenation of a source sentence and its target equivalent. In this work, we introduce a novel cross-lingual pre-training method called Alternating Language Modeling (ALM). Rather than simply concatenating sentences of different languages, it code-switches them, aiming to capture the rich cross-lingual context of words and phrases. More specifically, we randomly substitute source phrases with their target translations to create code-switched sentences. We then use these code-switched data to train the ALM model to predict words of different languages. We evaluate the pre-trained ALM model on the downstream tasks of machine translation and cross-lingual classification. Experiments show that ALM outperforms previous pre-training methods on three benchmarks.
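The code-switching step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the phrase table, the function names `code_switch` and `mask_tokens`, the substitution and masking probabilities, and the English-French example are all assumptions made for illustration. The abstract does not say how phrase translations are obtained or which ratios are used.

```python
import random


def code_switch(tokens, phrase_table, switch_prob=0.5, rng=None):
    """Randomly replace source phrases with target translations.

    phrase_table is a hypothetical dict mapping a tuple of source tokens
    to a list of target tokens; it stands in for whatever phrase pairs
    a real pipeline would extract.
    """
    if rng is None:
        rng = random.Random()
    max_len = max((len(p) for p in phrase_table), default=0)
    out, i = [], 0
    while i < len(tokens):
        replaced = False
        # Try the longest phrase starting at position i first.
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            phrase = tuple(tokens[i:i + n])
            if phrase in phrase_table and rng.random() < switch_prob:
                out.extend(phrase_table[phrase])
                i += n
                replaced = True
                break
        if not replaced:
            out.append(tokens[i])
            i += 1
    return out


def mask_tokens(tokens, mask_prob=0.15, mask_token="[MASK]", rng=None):
    """Mask tokens at random; labels hold the original token or None."""
    if rng is None:
        rng = random.Random()
    inputs, labels = [], []
    for tok in tokens:
        if rng.random() < mask_prob:
            inputs.append(mask_token)
            labels.append(tok)
        else:
            inputs.append(tok)
            labels.append(None)
    return inputs, labels


if __name__ == "__main__":
    # Hypothetical English-to-French phrase pairs, written by hand.
    table = {
        ("machine", "translation"): ["traduction", "automatique"],
        ("love",): ["aimons"],
    }
    source = "we love machine translation".split()
    mixed = code_switch(source, table, switch_prob=0.5, rng=random.Random(0))
    print(mixed)
    print(mask_tokens(mixed, mask_prob=0.15, rng=random.Random(1)))
```

In this sketch a model trained on the masked, code-switched sequence would have to predict words of either language from mixed-language context, which is the intuition the abstract gives for ALM.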

Publisher

Association for the Advancement of Artificial Intelligence (AAAI)

Subject

General Medicine

Cited by 13 articles.

1. Adaptive Neural Ranking Framework: Toward Maximized Business Goal for Cascade Ranking Systems;Proceedings of the ACM Web Conference 2024;2024-05-13

2. Multilingual Event Causality Identification via Meta-learning with Knowledge;Proceedings of the 2024 International Conference on Generative Artificial Intelligence and Information Security;2024-05-10

3. A cross-guidance cross-lingual model on generated parallel corpus for classical Chinese machine reading comprehension;Information Processing & Management;2024-03

4. SE-HCL: Schema Enhanced Hybrid Curriculum Learning for Multi-Turn Text-to-SQL;IEEE Access;2024

5. Can Pretrained English Language Models Benefit Non-English NLP Systems in Low-Resource Scenarios?;IEEE/ACM Transactions on Audio, Speech, and Language Processing;2024