Evaluating optimal reference translations

Published:2024-05-08 Page:1-24
ISSN:2977-0424
Container-title:Natural Language Processing
language:en
Short-container-title:Nat. lang. processing

Author:

Zouhar Vilém, Kloudová Věra, Popel Martin, Bojar Ondřej

Abstract

The overall translation quality reached by current machine translation (MT) systems for high-resourced language pairs is remarkably good. Standard methods of evaluation are not suitable nor intended to uncover the many translation errors and quality deficiencies that still persist. Furthermore, the quality of standard reference translations is commonly questioned and comparable quality levels have been reached by MT alone in several language pairs. Navigating further research in these high-resource settings is thus difficult. In this paper, we propose a methodology for creating more reliable document-level human reference translations, called “optimal reference translations,” with the simple aim to raise the bar of what should be deemed “human translation quality.” We evaluate the obtained document-level optimal reference translations in comparison with “standard” ones, confirming a significant quality increase and also documenting the relationship between evaluation and translation editing.

Publisher

Cambridge University Press (CUP)

Reference: 27 articles.

1. Specia, L. and Shah, K. (2014). Predicting human translation quality. In Proceedings of the 11th Conference of the Association for Machine Translation in the Americas: MT Researchers Track, Vancouver, Canada, Association for Machine Translation in the Americas, pp. 288–300.

2. How far do we agree on the quality of translation?

3. Graham, Y., Baldwin, T., Moffat, A. and Zobel, J. (2013). Continuous measurement scales in human evaluation of machine translation. In Proceedings of the 7th Linguistic Annotation Workshop and Interoperability with Discourse, Sofia, Bulgaria, pp. 33–41.

4. Findings of the IWSLT 2023 Evaluation Campaign

5. Experts, Errors, and Context: A Large-Scale Study of Human Evaluation for Machine Translation

Cited by 1 article.

1. Evaluating optimal reference translations – ERRATUM. Natural Language Processing, 2024-05-27.