1. Brockman, G., Cheung, V., Pettersson, L., Schneider, J., Schulman, J., Tang, J., and Zaremba, W. OpenAI Gym, 2016.
2. Coumans, E., Bai, Y., and Hsu, J. PyBullet. Available online: https://pypi.org/project/pybullet/. Retrieved: 5/30/2019.
3. Haarnoja, T., Zhou, A., Abbeel, P., and Levine, S. Soft actor-critic: Off-policy maximum entropy deep reinforcement learning with a stochastic actor. arXiv preprint arXiv:1801.01290, 2018.
4. Kingma, D. and Ba, J. Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980, 2014.
5. Lillicrap, T. P., Hunt, J. J., Pritzel, A., Heess, N., Erez, T., Tassa, Y., Silver, D., and Wierstra, D. Continuous control with deep reinforcement learning. arXiv preprint arXiv:1509.02971, 2015. Available online: https://arxiv.org/abs/1509.02971