1. Findings of the 2018 conference on machine translation (WMT18);Bojar;Proceedings of the Third Conference on Machine Translation,2018
2. Gu J. , Im D.J. and Li V. , Neural machine translation with gumbel-greedy decoding, Proceedings of the AAAI Conference on Artificial Intelligence, Vol. 32. No. 1, 2018.
3. Long short-term memory;Hochreiter;Neural Comput,1997
4. The vanishing gradient problem during learning recurrent neural nets and problem solutions;Hochreiter;Int J Uncertain Fuzziness Knowl-Based Syst,1998
5. Language models are unsupervised multitask learners,;Radford;OpenAI Blog,2019