Synchronous Bidirectional Neural Machine Translation

Author:

Long Zhou¹٬², Jiajun Zhang¹٬³, Chengqing Zong¹٬⁴٬⁵

Affiliation:

1. National Laboratory of Pattern Recognition, CASIA, Beijing, China

2. University of Chinese Academy of Sciences, Beijing, China (long.zhou@nlpr.ia.ac.cn)

3. University of Chinese Academy of Sciences, Beijing, China (jjzhang@nlpr.ia.ac.cn)

4. University of Chinese Academy of Sciences, Beijing, China

5. CAS Center for Excellence in Brain Science and Intelligence Technology, Shanghai, China (cqzong@nlpr.ia.ac.cn)

Abstract

Existing approaches to neural machine translation (NMT) generate the target-language sequence token by token, from left to right. This unidirectional decoding framework cannot make full use of target-side future context, which could be produced by decoding in the right-to-left direction, and thus suffers from the problem of unbalanced outputs. In this paper, we introduce a synchronous bidirectional neural machine translation (SB-NMT) model that predicts its outputs using left-to-right and right-to-left decoding simultaneously and interactively, in order to leverage both history and future information at the same time. Specifically, we first propose a new algorithm that enables synchronous bidirectional decoding in a single model. We then present an interactive decoding model in which left-to-right (right-to-left) generation depends not only on its previously generated outputs, but also on the future context predicted by right-to-left (left-to-right) decoding. We extensively evaluate the proposed SB-NMT model on the large-scale NIST Chinese-English, WMT14 English-German, and WMT18 Russian-English translation tasks. Experimental results demonstrate that our model achieves significant improvements over the strong Transformer baseline, by 3.92, 1.49, and 1.04 BLEU points, respectively, and obtains state-of-the-art performance on the Chinese-English and English-German translation tasks.
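To make the decoding scheme in the abstract concrete, the toy sketch below (plain Python, not the authors' implementation) illustrates only the control flow of synchronous, interactive bidirectional decoding: both directions emit one token per step, and each direction's scorer sees the partial output produced so far by the opposite direction. The `score_next` stub is a hypothetical stand-in for a real decoder step; its scores are fabricated purely so the example runs end to end.

```python
from typing import List, Tuple

VOCAB = ["<eos>", "we", "translate", "text"]

def score_next(own_prefix: List[str],
               other_prefix: List[str]) -> List[Tuple[str, float]]:
    """Toy stand-in for one decoder step.

    A real SB-NMT decoder step would condition on the encoded source
    sentence, on own_prefix (this direction's history), and on
    other_prefix (the "future" context produced so far by the opposite
    direction). Here the log-probabilities are fabricated so that the
    sketch is runnable.
    """
    scores = []
    for rank, tok in enumerate(VOCAB):
        if tok == "<eos>":
            # Discourage stopping until the toy hypothesis has 3 tokens.
            s = 0.0 if len(own_prefix) >= 3 else -100.0
        else:
            s = -float(rank)
        scores.append((tok, s))
    return scores

def sb_greedy_decode(max_len: int = 10) -> Tuple[List[str], List[str]]:
    """Greedy (beam size 1 per direction) synchronous bidirectional decoding."""
    l2r: List[str] = []   # left-to-right hypothesis, natural order
    r2l: List[str] = []   # right-to-left hypothesis, stored reversed
    for _ in range(max_len):
        # Both directions advance in lockstep; each choice is informed
        # by the tokens the opposite direction has generated so far.
        l2r_tok, _ = max(score_next(l2r, r2l), key=lambda kv: kv[1])
        r2l_tok, _ = max(score_next(r2l, l2r), key=lambda kv: kv[1])
        if l2r_tok == "<eos>" and r2l_tok == "<eos>":
            break
        if l2r_tok != "<eos>":
            l2r.append(l2r_tok)
        if r2l_tok != "<eos>":
            r2l.append(r2l_tok)
    return l2r, list(reversed(r2l))  # restore natural order for r2l

if __name__ == "__main__":
    left_to_right, right_to_left = sb_greedy_decode()
    print("L2R hypothesis:", left_to_right)
    print("R2L hypothesis:", right_to_left)
```

In the full model the two directions do not run as separate loops: as the abstract states, decoding happens in a single model, with the interaction realized through attention inside one decoder. The lockstep loop above is only meant to show why each direction gains access to the other direction's future tokens at every step.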

Publisher

MIT Press - Journals


Cited by

57 articles.