1. Ba, J.L., Kiros, R., Hinton, G.E.: Layer normalization. CoRR abs/1607.06450 (2016). http://arxiv.org/abs/1607.06450
2. Chiang, D., Marton, Y., Resnik, P.: Online large-margin training of syntactic and structural translation features. In: Proceedings of the 2008 Conference on Empirical Methods in Natural Language Processing, pp. 224–233. Association for Computational Linguistics (2008)
3. Gehring, J., Auli, M., Grangier, D., Yarats, D., Dauphin, Y.N.: Convolutional sequence to sequence learning. arXiv e-prints (May 2017)
4. Hassan, H., et al.: Achieving human parity on automatic Chinese to English news translation. arXiv preprint arXiv:1803.05567 (2018)
5. He, K.: Lecture Notes in Computer Science (2016)