1. Bahdanau D, Cho K, Bengio Y (2014) Neural machine translation by jointly learning to align and translate. arXiv preprint arXiv:1409.0473
2. Vaswani A, Shazeer N, Parmar N, Uszkoreit J, Jones L, Gomez AN et al (2017) Attention is all you need. In: Guyon I, Luxburg UV, Bengio S, Wallach H, Fergus R, Vishwanathan S et al (eds) Advances in neural information processing systems, vol 30. Available from: https://proceedings.neurips.cc/paper_files/paper/2017/file/3f5ee243547dee91fbd053c1c4a845aa-Paper.pdf
3. Luong MT, Pham H, Manning CD (2015) Effective approaches to attention-based neural machine translation. arXiv preprint arXiv:1508.04025
4. Geng X, Wang L, Wang X, Yang M, Feng X, Qin B et al (2022) Learning to refine source representations for neural machine translation. Int J Mach Learn Cybern 13(8):2199–2212
5. Wang W, Jiao W, Hao Y, Wang X, Shi S, Tu Z et al (2022) Understanding and improving sequence-to-sequence pretraining for neural machine translation. arXiv preprint arXiv:2203.08442