1. Proximal policy optimization algorithms;schulman;ArXiv Preprint,2017
2. Trust region policy optimization;schulman;International Conference on Machine Learning,2015
3. The surprising effectiveness of PPO in cooperative, multi-agent games;yu;ArXiv Preprint,2021
4. Multi-Task Deep Reinforcement Learning with PopArt
5. A natural policy gradient;kakade;Advances in neural information processing systems,2001