1. Van Hasselt, H., Guez, A., and Silver, D. (2016, January 12–17). Deep Reinforcement Learning with Double Q-learning. Proceedings of the 30th AAAI Conference on Artificial Intelligence, AAAI 2016, Phoenix, AZ, USA.
2. Schulman, J., Wolski, F., Dhariwal, P., Radford, A., and Openai, O.K. (2017). Proximal Policy Optimization Algorithms. arXiv.
3. Haarnoja, T., Zhou, A., Hartikainen, K., Tucker, G., Ha, S., Tan, J., Kumar, V., Zhu, H., Gupta, A., and Abbeel, P. (2018). Soft Actor-Critic Algorithms and Applications. arXiv.
4. Silver, D., Hubert, T., Schrittwieser, J., Antonoglou, I., Lai, M., Guez, A., Lanctot, M., Sifre, L., Kumaran, D., and Graepel, T. (2017). Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm. arXiv.
5. Kurach, K., Raichuk, A., Stańczyk, P., Zaja̧c, M., Bachem, O., Espeholt, L., Riquelme, C., Vincent, D., Michalski, M., and Bousquet, O. (2020, January 7–12). Google Research Football: A Novel Reinforcement Learning Environment. Proceedings of the AAAI 2020—34th AAAI Conference on Artificial Intelligence, New York, NY, USA.