1. The mirage of action-dependent baselines in reinforcement learning[C];Tucker;International Conference on Machine Learning,2018
2. Proximal policy optimization algorithms[J];Schulman;arXiv preprint,2017
3. High-dimensional continuous control using generalized advantage estimation[J];Schulman;arXiv preprint,2015
4. Setting up a Reinforcement Learning Task with a Real-World Robot[C];Mahmood;IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS),2018
5. Control of a Quadrotor With Reinforcement Learning