Abstract
In recent years, machine translation engines built on both traditional statistical models and newer neural network models have achieved significant gains in translation quality by training on large-scale, high-quality corpora. These advances have steadily improved translation quality for high-resource languages. For low-resource languages, however, performance remains suboptimal, primarily because the large-scale bilingual parallel corpora needed to train neural models are difficult to obtain. This study aims to enhance machine translation quality for low-resource languages by utilizing large language models, exploring various methods to improve translation quality and evaluating their effectiveness. Specifically, the research compares the effectiveness of two of these methods through human evaluation using the Multidimensional Quality Metrics (MQM) framework.
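To make the evaluation setup concrete, the following is a minimal sketch of how an MQM-style score is typically computed from human error annotations: annotators mark errors by category and severity, severities are mapped to penalty weights, and the weighted penalties are normalized by segment length. The severity weights below (minor = 1, major = 5, critical = 10) and the per-100-words normalization are common choices in MQM deployments, not the paper's stated scheme; the class and function names are illustrative.

```python
from dataclasses import dataclass

# Assumed severity weights; actual MQM deployments vary
# (e.g., some use 25 for critical errors).
SEVERITY_WEIGHTS = {"minor": 1, "major": 5, "critical": 10}

@dataclass
class ErrorAnnotation:
    category: str   # e.g. "accuracy/mistranslation", "fluency/grammar"
    severity: str   # "minor", "major", or "critical"

def mqm_score(errors: list[ErrorAnnotation], word_count: int) -> float:
    """Return weighted error points per 100 words (lower is better)."""
    penalty = sum(SEVERITY_WEIGHTS[e.severity] for e in errors)
    return 100.0 * penalty / word_count

# Example: two annotated errors in a 50-word translation segment.
errors = [
    ErrorAnnotation("accuracy/mistranslation", "major"),
    ErrorAnnotation("fluency/punctuation", "minor"),
]
print(mqm_score(errors, word_count=50))  # (5 + 1) / 50 * 100 = 12.0
```

Normalizing by length lets segments and systems of different sizes be compared on a single scale, which is what makes MQM suitable for contrasting translation methods in a controlled human evaluation.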
Publisher
Century Science Publishing Co