1. Papineni et al., 2002. BLEU: a method for automatic evaluation of machine translation.
2. Banerjee and Lavie, 2005. METEOR: an automatic metric for MT evaluation with improved correlation with human judgments.
3. Snover et al., 2006. A study of translation edit rate with targeted human annotation.
4. Specia et al., 2009. Estimating the sentence-level quality of machine translation systems.
5. Kocmi and Federmann, 2023. Large language models are state-of-the-art evaluators of translation quality.