DSISA: A New Neural Machine Translation Combining Dependency Weight and Neighbors

Published:2023-12-29
ISSN:2375-4699
Container-title:ACM Transactions on Asian and Low-Resource Language Information Processing
language:en
Short-container-title:ACM Trans. Asian Low-Resour. Lang. Inf. Process.

Author:

Li Lingfang¹², Zhang Aijun³, Luo Ming-Xing¹

Affiliation:

1. School of Information Science and Technology, Southwest Jiaotong University, Chengdu 610031, China

2. School of Information Engineering, Inner Mongolia University of Science & Technology, Baotou 014010, China

3. School of Information Engineering, Inner Mongolia University of Science & Technology

Abstract

Most previous neural machine translation (NMT) models rely on parallel corpora. Explicitly integrating prior syntactic structure information can improve NMT. In this paper, we propose Syntax Induced Self-Attention (SISA), which explores the influence of dependency relations between words through the attention mechanism and fine-tunes the attention allocation over the sentence using the resulting dependency weights. We present a new model, Double Syntax Induced Self-Attention (DSISA), which fuses the features extracted by SISA and by a compact convolutional neural network (CNN). SISA can alleviate the long-range dependency problem in sentences, while the CNN captures limited context from neighboring tokens. DSISA uses two different neural networks to extract different features for a richer semantic representation, and it replaces the first layer of the Transformer encoder. DSISA thus makes use of not only the global features of tokens in a sentence but also the local features formed with adjacent tokens. Finally, we perform simulation experiments that verify the performance of the new model on standard corpora.
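
The abstract does not give the model's equations, so the following is only an illustrative sketch of the general idea: a self-attention branch whose weights are rescaled by dependency-tree proximity, a small convolution over neighboring tokens, and a fusion of the two. Every concrete choice here is an assumption rather than the authors' method: the function names, the `decay ** distance` weighting, the fixed smoothing kernel, fusion by addition, and the omission of learned query/key/value projections.

```python
import math
from collections import deque


def tree_distances(heads):
    """Pairwise path lengths in a dependency tree.

    heads[i] is the index of token i's head, or -1 for the root.
    """
    n = len(heads)
    adj = [[] for _ in range(n)]
    for child, head in enumerate(heads):
        if head >= 0:
            adj[child].append(head)
            adj[head].append(child)
    dist = [[0] * n for _ in range(n)]
    for src in range(n):
        seen = {src}
        queue = deque([(src, 0)])
        while queue:  # breadth-first search from src
            node, d = queue.popleft()
            dist[src][node] = d
            for nxt in adj[node]:
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append((nxt, d + 1))
    return dist


def softmax(row):
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def syntax_attention(x, heads, decay=0.5):
    """Global branch: dot-product attention rescaled by tree proximity.

    Assumption: weights are multiplied by decay ** distance and renormalized.
    Simplification: queries, keys and values are all the raw inputs x.
    """
    n, d = len(x), len(x[0])
    dist = tree_distances(heads)
    out = []
    for i in range(n):
        scores = [sum(a * b for a, b in zip(x[i], x[j])) / math.sqrt(d)
                  for j in range(n)]
        attn = softmax(scores)
        scaled = [a * decay ** dist[i][j] for j, a in enumerate(attn)]
        total = sum(scaled)
        attn = [s / total for s in scaled]
        out.append([sum(attn[j] * x[j][k] for j in range(n)) for k in range(d)])
    return out


def local_conv(x, kernel=(0.25, 0.5, 0.25)):
    """Local branch: width-3 convolution over token positions, zero padded.

    Assumption: a fixed kernel stands in for learned CNN filters.
    """
    n, d = len(x), len(x[0])
    half = len(kernel) // 2
    out = []
    for i in range(n):
        row = [0.0] * d
        for offset, w in enumerate(kernel):
            j = i + offset - half
            if 0 <= j < n:
                for k in range(d):
                    row[k] += w * x[j][k]
        out.append(row)
    return out


def fuse(x, heads):
    """Combine global and local features (assumed here: elementwise sum)."""
    g = syntax_attention(x, heads)
    loc = local_conv(x)
    return [[a + b for a, b in zip(gr, lr)] for gr, lr in zip(g, loc)]


# Toy input: three tokens with 2-dimensional embeddings; token 2 is the root.
embeddings = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
heads = [1, 2, -1]
features = fuse(embeddings, heads)
```

In this sketch `features` has one fused vector per token, which is the shape a first encoder layer would hand to the remaining Transformer layers. How the published model actually computes dependency weights and fuses the two branches should be taken from the paper itself.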

Publisher

Association for Computing Machinery (ACM)

Subject

General Computer Science

Link

https://dl.acm.org/doi/pdf/10.1145/3638762

References (50 articles)

1. Neural machine translation: A review of methods, resources, and tools

2. C. Zhou, F. Meng, J. Zhou, et al. 2022. Confidence Based Bidirectional Global Context Aware Training Framework for Neural Machine Translation. In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics, Dublin, Ireland, 2878-2889.

3. J. Hu, H. Hayashi, K. Cho, et al. 2022. Deep: Denoising entity pre-training for neural machine translation. In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics, Dublin, Ireland, 1753-1766.

4. N. Kalchbrenner, L. Espeholt, K. Simonyan, et al. 2016. Neural machine translation in linear time. Retrieved Mar 15, 2017 from https://arxiv.org/abs/1610.10099.

5. J. Bastings, I. Titov, et al. 2017. Graph Convolutional Encoders for Syntax-aware Neural Machine Translation. In Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, Copenhagen, Denmark, 1957-1967.