1. Mnih, V., Kavukcuoglu, K., Silver, D., Graves, A., Antonoglou, I., Wierstra, D., and Riedmiller, M. (2013). Playing atari with deep reinforcement learning. arXiv.
2. Schulman, J., Wolski, F., Dhariwal, P., Radford, A., and Klimov, O. (2017). Proximal policy optimization algorithms. arXiv.
3. Haarnoja, T., Zhou, A., Abbeel, P., and Levine, S. (2018, January 10–15). Soft actor-critic: Off-policy maximum entropy deep reinforcement learning with a stochastic actor. Proceedings of the International Conference on Machine Learning, Stockholm, Sweden.
4. Wolf, P., Hubschneider, C., Weber, M., Bauer, A., Härtl, J., Dürr, F., and Zöllner, J.M. (2017, January 11–14). Learning how to drive in a real world simulation with deep q-networks. Proceedings of the 2017 IEEE Intelligent Vehicles Symposium (IV), Los Angeles, CA, USA.
5. Lillicrap, T.P., Hunt, J.J., Pritzel, A., Heess, N., Erez, T., Tassa, Y., Silver, D., and Wierstra, D. (2015). Continuous control with deep reinforcement learning. arXiv.