1. Lowe R, Wu YI, Tamar A, Harb J, Abbeel P, Mordatch I. Multi‐agent actor‐critic for mixed cooperative‐competitive environments. Proceedings of the 31st Conference on Neural Information Processing Systems; 2017. p. 30.
2. Schulman J, Wolski F, Dhariwal P, Radford A, Klimov O. Proximal policy optimization algorithms. arXiv preprint arXiv:1707.06347; 2017.
3. Haarnoja T, Zhou A, Abbeel P, Levine S. Soft actor‐critic: Off‐policy maximum entropy deep reinforcement learning with a stochastic actor. Proceedings of the 35th International Conference on Machine Learning; 2018. p. 1861–70.
4. Brockman G, Cheung V, Pettersson L, Schneider J, Schulman J, Tang J, et al. OpenAI Gym. arXiv preprint arXiv:1606.01540; 2016.
5. OpenAI. Robogym. 2020. https://github.com/openai/robogym