Hybrid Actor-Critic Reinforcement Learning in Parameterized Action Space

Published:2019-08
Container-title:Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence

Author:

Fan Zhou¹, Su Rui¹, Zhang Weinan¹, Yu Yong¹

Affiliation:

1. Shanghai Jiao Tong University

Abstract

In this paper we propose a hybrid architecture of actor-critic algorithms for reinforcement learning in parameterized action space. The architecture consists of multiple parallel sub-actor networks, which decompose the structured action space into simpler action spaces, and a critic network that guides the training of all sub-actor networks. While this paper focuses mainly on parameterized action space, the proposed architecture, which we call hybrid actor-critic, can be extended to more general action spaces that have a hierarchical structure. We present an instance of the hybrid actor-critic architecture based on proximal policy optimization (PPO), which we refer to as hybrid proximal policy optimization (H-PPO). Our experiments test H-PPO on a collection of tasks with parameterized action space, where H-PPO demonstrates superior performance over previous methods of parameterized action reinforcement learning.
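To make the decomposition described in the abstract concrete, the sketch below shows one way parallel sub-actors and a single critic could fit together. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not give network details, so the split into one discrete sub-actor and one continuous-parameter sub-actor, the single-layer linear "networks", the one scalar parameter per discrete action, the fixed `param_std`, the state-value critic, and the class name `HybridActorCritic` are all hypothetical choices made for the example.

```python
import math
import random


def linear(weights, bias, x):
    """Apply one dense layer; `weights` holds one row per output unit."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


class HybridActorCritic:
    """Toy layout (assumed): two parallel sub-actors plus one critic."""

    def __init__(self, state_dim, num_discrete, seed=0):
        self.rng = random.Random(seed)
        self.num_discrete = num_discrete

        def init(rows):
            return [[self.rng.uniform(-0.1, 0.1) for _ in range(state_dim)]
                    for _ in range(rows)]

        # Sub-actor 1 (assumed): logits over the discrete actions.
        self.w_discrete, self.b_discrete = init(num_discrete), [0.0] * num_discrete
        # Sub-actor 2 (assumed): mean of one scalar parameter per discrete action.
        self.w_param, self.b_param = init(num_discrete), [0.0] * num_discrete
        # Critic (assumed): a single state-value estimate.
        self.w_value, self.b_value = init(1), [0.0]
        self.param_std = 0.1  # fixed exploration noise, hypothetical

    def act(self, state):
        """Sample a parameterized action (k, x_k) for the given state."""
        probs = softmax(linear(self.w_discrete, self.b_discrete, state))
        k = self.rng.choices(range(self.num_discrete), weights=probs)[0]
        means = linear(self.w_param, self.b_param, state)
        # Only the parameter belonging to the chosen discrete action is used.
        x_k = self.rng.gauss(means[k], self.param_std)
        return k, x_k, probs

    def value(self, state):
        """Critic output that a training loop could share across sub-actors."""
        return linear(self.w_value, self.b_value, state)[0]
```

Calling `HybridActorCritic(state_dim=4, num_discrete=3).act(state)` returns a discrete action index, its continuous parameter, and the discrete action probabilities. In a PPO-style training loop, an advantage estimate derived from `value` could be used to update both sub-actors, which is one reading of the critic "guiding" all sub-actor networks.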

Publisher

International Joint Conferences on Artificial Intelligence Organization

Cited by 50 articles.

1. Offline Reinforcement Learning with Constrained Hybrid Action Implicit Representation Towards Wargaming Decision-Making;Tsinghua Science and Technology;2024-10

2. InSS: An Intelligent Scheduling Orchestrator for Multi-GPU Inference With Spatio-Temporal Sharing;IEEE Transactions on Parallel and Distributed Systems;2024-10

3. Adaptive Frequency Green Light Optimal Speed Advisory Based on Deep Reinforcement Learning;Journal of Transportation Engineering, Part A: Systems;2024-10

4. Optimized Online Remaining Useful Life Prediction for Nuclear Circulating Water Pump Considering Time-Varying Degradation Mechanism;IEEE Transactions on Industrial Informatics;2024-09

5. Self-organized underwater image enhancement;ISPRS Journal of Photogrammetry and Remote Sensing;2024-09