1. On the theory of policy gradient methods: Optimality, approximation, and distribution shift;Agarwal;Journal of Machine Learning Research,2021
2. Reinforcement learning and optimal control;Bertsekas,2019
3. Convex optimization;Boyd,2004
4. Fast global convergence of natural policy gradient methods with entropy regularization;Cen;Operations Research,2021
5. On the sample complexity of the linear quadratic regulator;Dean;Foundations of Computational Mathematics,2020