1. Specia, L. and Shah, K. (2014). Predicting human translation quality. In Proceedings of the 11th Conference of the Association for Machine Translation in the Americas: MT Researchers Track, Vancouver, Canada, Association for Machine Translation in the Americas, pp. 288–300.
2. How far do we agree on the quality of translation?
3. Graham, Y., Baldwin, T., Moffat, A. and Zobel, J. (2013). Continuous measurement scales in human evaluation of machine translation. In Proceedings of the 7th Linguistic Annotation Workshop and Interoperability with Discourse, Sofia, Bulgaria, pp. 33–41.
4. Findings of the IWSLT 2023 Evaluation Campaign
5. Experts, Errors, and Context: A Large-Scale Study of Human Evaluation for Machine Translation