Affiliation:
1. IIIT Hyderabad. khullar.payal@gmail.com
Abstract
This article describes an experiment to evaluate the impact of different types of ellipses discussed in theoretical linguistics on Neural Machine Translation (NMT), with English as the source language and Hindi and Telugu as the target languages. Manual evaluation shows that most of the errors made by Google NMT are located in the clause containing the ellipsis, that such errors are slightly more frequent in Telugu than in Hindi, and that translation adequacy improves when ellipses are reconstructed with their antecedents. These findings not only confirm the importance of ellipses and their resolution for MT, but also hint at a possible correlation between the translation of discourse devices like ellipses and the morphological incongruity of the source and target languages. We also observe that not all ellipses are translated poorly or benefit from reconstruction, which argues for treating different types of ellipses differently in MT research.
Subject
Artificial Intelligence, Computer Science Applications, Linguistics and Language, Language and Linguistics