Using syntax for improving phrase-based SMT in low-resource languages

Published: 2019-07-10
ISSN: 2055-7671
Container-title: Digital Scholarship in the Humanities
Language: en

Author:

Fadaei Hakimeh¹, Faili Heshaam²

Affiliation:

1. School of Electrical and Computer Engineering, College of Engineering, University of Tehran, Tehran, Iran

2. School of Electrical and Computer Engineering, College of Engineering, University of Tehran, Tehran, Iran; School of Computer Science, Institute for Research in Fundamental Sciences (IPM), Tehran, Iran

Abstract

Data-driven approaches to machine translation, such as statistical and neural machine translation, suffer from sparsity when dealing with low-resource languages. In these cases, using other sources of information, including linguistic information, could alleviate the problem. In this article, we focus on the problem of word ordering in translation from a high-resource to a low-resource language and try to improve the quality by using syntactic information from the high-resource side. We propose syntactic features based on Tree Adjoining Grammar (TAG) to be employed in a phrase-based SMT model in order to improve word ordering. In this work, a set of synchronous TAG rules is extracted and used to estimate the probability of the phrase orders suggested by the phrase-based model. The main idea of the article is to handle word ordering by using the extended domain of locality property of TAG and abstracting long-distance dependencies into a local view, which is a TAG elementary tree. Experiments on English–Persian and English–German translation showed that, by combining the proposed TAG-based reordering features with lexical and hierarchical reordering models, we gain significant improvements over the baseline and in comparison with a neural reordering model and a pre-reordering model.

Funder

Iran National Science Foundation

Publisher

Oxford University Press (OUP)

Subject

Computer Science Applications, Linguistics and Language, Language and Linguistics, Information Systems

Link

http://academic.oup.com/dsh/advance-article-pdf/doi/10.1093/llc/fqz033/28923924/fqz033.pdf

References: 66 articles.

1. Decoder integration and expected BLEU training for recurrent neural network language models; Auli; In Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics, 2014

2. A survey of word reordering in statistical machine translation: Computational models and language phenomena; Bisazza; Computational Linguistics, 2016

Cited by 3 articles.

1. The impact of task complexity and translating self-efficacy belief on students' translation performance: Evidence from process and product data; Frontiers in Psychology; 2022-11-03

2. LSTM-Based Attentional Embedding for English Machine Translation; Scientific Programming; 2022-03-16

3. Linguistically enhanced word segmentation for better neural machine translation of low resource agglutinative languages; International Journal of Speech Technology; 2021-07-05