1. Abdolmaleki, A., Lioutikov, R., Peters, J., Lau, N., Reis, L., & Neumann, G. (2015). Model-based relative entropy stochastic search. In Advances in Neural Information Processing Systems (NIPS), MIT Press.
2. Abdolmaleki, A., Springenberg, J. T., Tassa, Y., Munos, R., Heess, N., & Riedmiller, M. (2018). Maximum a posteriori policy optimisation. In Proceedings of the International Conference on Learning Representations (ICLR).
3. Akrour, R., Abdolmaleki, A., Abdulsamad, H., & Neumann, G. (2016). Model-free trajectory optimization for reinforcement learning. In Proceedings of the International Conference on Machine Learning (ICML).
4. Akrour, R., Abdolmaleki, A., Abdulsamad, H., Peters, J., & Neumann, G. (2018). Model-free trajectory-based policy optimization with monotonic improvement. Journal of Machine Learning Research, 19(14), 1–25.
5. Amari, S. (1998). Natural gradient works efficiently in learning. Neural Computation, 10(2), 251–276.