1. Addressing function approximation error in actor-critic methods;Fujimoto
2. Learning to Cooperate: A Hierarchical Cooperative Dual Robot Arm Approach for Underactuated Pick-and-Placing
3. Soft actor-critic: Off-policy maximum entropy deep reinforcement learning with a stochastic actor;Haarnoja;International Conference on Machine Learning,2018
4. High-dimensional continuous control using generalized advantage estimation;Schulman
5. Adam: A method for stochastic optimization;Kingma