1. A prototype of a cooperative conveyance system by wireless-network control of multiple robots.
2. Lillicrap TP, Hunt JJ, Pritzel A, et al. Continuous control with deep reinforcement learning. Preprint 2015. Available from: arXiv:1509.02971.
3. Schulman J, Levine S, Abbeel P, et al. Trust region policy optimization. International conference on machine learning; PMLR; 2015. p. 1889–1897.
4. Mnih V, Puigdomenech Badia A, Mirza M, et al. Asynchronous methods for deep reinforcement learning. International conference on machine learning; PMLR; 2016. p. 1928–1937.
5. Schulman J, Wolski F, Dhariwal P, et al. Proximal policy optimization algorithms. Preprint 2017. Available from: arXiv:1707.06347.