Multilingual Denoising Pre-training for Neural Machine Translation

Published: 2020-12 Issue: Volume: 8 Page: 726-742
ISSN: 2307-387X
Container-title: Transactions of the Association for Computational Linguistics
Language: en
Short-container-title: Transactions of the Association for Computational Linguistics

Author:

Liu Yinhan¹, Gu Jiatao², Goyal Naman², Li Xian², Edunov Sergey², Ghazvininejad Marjan², Lewis Mike², Zettlemoyer Luke²

Affiliation:

1. Birch Technology.

2. Facebook AI.

Abstract

This paper demonstrates that multilingual denoising pre-training produces significant performance gains across a wide variety of machine translation (MT) tasks. We present mBART, a sequence-to-sequence denoising auto-encoder pre-trained on large-scale monolingual corpora in many languages using the BART objective (Lewis et al., 2019). mBART is the first method for pre-training a complete sequence-to-sequence model by denoising full texts in multiple languages, whereas previous approaches have focused only on the encoder, decoder, or reconstructing parts of the text. Pre-training a complete model allows it to be directly fine-tuned for supervised (both sentence-level and document-level) and unsupervised machine translation, with no task-specific modifications. We demonstrate that adding mBART initialization produces performance gains in all but the highest-resource settings, including up to 12 BLEU points for low-resource MT and over 5 BLEU points for many document-level and unsupervised models. We also show that it enables transfer to language pairs with no bi-text or that were not in the pre-training corpus, and present extensive analysis of which factors contribute the most to effective pre-training.

Publisher

MIT Press - Journals

Subject

Artificial Intelligence, Computer Science Applications, Linguistics and Language, Human-Computer Interaction, Communication

Link

https://www.mitpressjournals.org/doi/pdf/10.1162/tacl_a_00343

Reference: 56 articles.

Cited by 360 articles.

1. LLMs-based machine translation for E-commerce;Expert Systems with Applications;2024-12

2. LAWSUIT: a LArge expert-Written SUmmarization dataset of ITalian constitutional court verdicts;Artificial Intelligence and Law;2024-09-09

3. DFCNet+: Cross-modal dynamic feature contrast net for continuous sign language recognition;Image and Vision Computing;2024-09

4. Automated analysis and assignment of maintenance work orders using natural language processing;Automation in Construction;2024-09

5. LLMEffiChecker: Understanding and Testing Efficiency Degradation of Large Language Models;ACM Transactions on Software Engineering and Methodology;2024-08-26