[1] T. Mikolov, I. Sutskever, K. Chen, G.S. Corrado, and J. Dean, “Distributed representations of words and phrases and their compositionality,” Advances in Neural Information Processing Systems 26, ed. C.J.C. Burges, L. Bottou, M. Welling, Z. Ghahramani, and K.Q. Weinberger, pp.3111-3119, Curran Associates, Inc., Dec. 2013.
[2] N. Kalchbrenner and P. Blunsom, “Recurrent continuous translation models,” Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing, Seattle, Washington, USA, pp.1700-1709, Association for Computational Linguistics, Oct. 2013.
[3] M. Auli, M. Galley, C. Quirk, and G. Zweig, “Joint language and translation modeling with recurrent neural networks,” Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing, Seattle, Washington, USA, pp.1044-1054, Association for Computational Linguistics, Oct. 2013.
[4] J. Devlin, R. Zbib, Z. Huang, T. Lamar, R. Schwartz, and J. Makhoul, “Fast and robust neural network joint models for statistical machine translation,” Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics, Baltimore, Maryland, pp.1370-1380, Association for Computational Linguistics, June 2014. doi: 10.3115/v1/p14-1129
[5] J. Zhang, D. Zhang, and J. Hao, “Local translation prediction with global sentence representation,” Proceedings of the 24th International Conference on Artificial Intelligence, pp.1398-1404, AAAI Press, July 2015.