A monotonic policy optimization algorithm for high-dimensional continuous control problem in 3D MuJoCo

Published: 2018-06-04 Issue: 20 Volume: 78 Page: 28665-28680
ISSN: 1380-7501
Container-title: Multimedia Tools and Applications
Language: en
Short-container-title: Multimed Tools Appl

Author:

Yuan Qunyong, Xiao Nanfeng

Publisher

Springer Science and Business Media LLC

Subject

Computer Networks and Communications, Hardware and Architecture, Media Technology, Software

Link

http://link.springer.com/content/pdf/10.1007/s11042-018-6098-y.pdf

References: 24 articles.

1. Achiam J (2016) Easy monotonic policy iteration [J]. arXiv:1602.09118

2. Duan Y, Chen X, Houthooft R, et al. (2016) Benchmarking deep reinforcement learning for continuous control [J]. Proceedings of The 33rd International Conference on Machine Learning, p 1329–1338

3. Haviv M, Van Der Heyden L (1984) Perturbation bounds for the stationary probabilities of a finite Markov chain. Adv Appl Probab 16(4):804–818. ISSN 00018678. URL http://www.jstor.org/stable/142734

4. Kakade S (2001a) A natural policy gradient. In: NIPS, volume 14, p 1531–1538

5. Kakade S, Langford J (2002) Approximately optimal approximate reinforcement learning. Nineteenth International Conference on Machine Learning. Morgan Kaufmann Publishers Inc., p 267–274

Cited by 2 articles.

1. Reinforcement learning based monotonic policy for online resource allocation;Future Generation Computer Systems;2023-01

2. A High-Fidelity Simulation Platform for Industrial Manufacturing by Incorporating Robotic Dynamics Into an Industrial Simulation Tool;IEEE Robotics and Automation Letters;2022-10