1. P. Abbeel, A. Y. Ng, (2004). Apprenticeship learning via inverse reinforcement learning. In Proc. of the 21st International Conference on Machine Learning.
2. Z. Ahmed, N. Le Roux, M. Norouzi, D. Schuurmans, (2019). Understanding the impact of entropy on policy optimization. In Proc. of the 36th International Conference on Machine Learning, pp. 151–160.
3. R. Amit, R. Meir, K. Ciosek, (2020). Discount factor as a regularizer in reinforcement learning. In Proc. of the 37th International Conference on Machine Learning.
4. Ashida, (2019). Multiple tracking and machine learning reveal dopamine modulation for area-restricted foraging behaviors via velocity change in Caenorhabditis elegans. Neuroscience Letters.
5. Azar, (2012). Dynamic policy programming. Journal of Machine Learning Research.