1. Bahdanau, D., Chorowski, J., Serdyuk, D., Brakel, P., Bengio, Y.: End-to-end attention-based large vocabulary speech recognition. CoRR abs/1508.04395 (2015), http://arxiv.org/abs/1508.04395
2. Bakker, B.: Reinforcement learning by backpropagation through an LSTM model/critic. In: 2007 IEEE International Symposium on Approximate Dynamic Programming and Reinforcement Learning, Honolulu, HI (2007)
3. Devlin, J., Chang, M., Lee, K., Toutanova, K.: BERT: pre-training of deep bidirectional transformers for language understanding. CoRR abs/1810.04805 (2018), http://arxiv.org/abs/1810.04805
4. Dhariwal, P., et al.: OpenAI Baselines. GitHub repository (2017), https://github.com/openai/baselines
5. Dong, L., Lapata, M.: Coarse-to-fine decoding for neural semantic parsing. CoRR abs/1805.04793 (2018), http://arxiv.org/abs/1805.04793