Relative Entropy Policy Search

Published:2010-07-05 Issue:1 Volume:24 Page:1607-1612
ISSN:2374-3468
Container-title:Proceedings of the AAAI Conference on Artificial Intelligence
Short-container-title:AAAI

Author:

Peters Jan, Mülling Katharina, Altun Yasemin

Abstract

Policy search is a successful approach to reinforcement learning. However, policy improvements often result in the loss of information. Hence, it has been marred by premature convergence and implausible solutions. As first suggested in the context of covariant policy gradients, many of these problems may be addressed by constraining the information loss. In this paper, we continue this path of reasoning and suggest the Relative Entropy Policy Search (REPS) method. The resulting method differs significantly from previous policy gradient approaches and yields an exact update step. It can be shown to work well on typical reinforcement learning benchmark problems.
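The idea of constraining the information loss can be illustrated with a small sketch. It is not the paper's full method: it assumes a simplified, sample-based setting in which each sampled rollout has a scalar return and the update reweights the samples so that the Kullback-Leibler divergence from the old (uniform) sample weights stays at a bound `epsilon`. The function names, the search bracket for the temperature `eta`, and the bisection solver are illustrative assumptions, not taken from the paper.

```python
import math


def softmax_weights(returns, eta):
    """Weights p_i proportional to exp(R_i / eta), computed stably."""
    r_max = max(returns)
    exps = [math.exp((r - r_max) / eta) for r in returns]
    total = sum(exps)
    return [e / total for e in exps]


def kl_to_uniform(weights):
    """KL(p || q), where q is the uniform distribution over the samples."""
    n = len(weights)
    return sum(p * math.log(p * n) for p in weights if p > 0.0)


def reps_weights(returns, epsilon, iters=200):
    """Reweight samples to favor high returns while keeping KL <= epsilon.

    The KL divergence of the exponential weights decreases as the
    temperature eta grows, so bisection on eta finds the point where the
    bound is met. This is the stationary point of the convex dual
    g(eta) = eta * epsilon + eta * log(mean(exp(R / eta))).
    """
    lo, hi = 1e-6, 1e6  # assumed search bracket for eta (hypothetical)
    for _ in range(iters):
        eta = math.sqrt(lo * hi)  # bisect in log space
        if kl_to_uniform(softmax_weights(returns, eta)) > epsilon:
            lo = eta  # update too greedy: raise the temperature
        else:
            hi = eta  # bound satisfied: try a lower temperature
    return softmax_weights(returns, hi), hi


if __name__ == "__main__":
    weights, eta = reps_weights([1.0, 2.0, 3.0, 4.0], epsilon=0.1)
    print("eta:", eta)
    print("weights:", weights)
    print("KL to old weights:", kl_to_uniform(weights))
```

A smaller `epsilon` keeps the new weights closer to uniform (a more cautious update), while a larger one concentrates weight on the best samples. In a complete policy search loop, such weights would typically be used to refit the policy by weighted maximum likelihood.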

Publisher

Association for the Advancement of Artificial Intelligence (AAAI)

Subject

General Medicine

Cited by 37 articles.

1. A Policy Searched-Based Optimization Algorithm for Obstacle Avoidance in Robot Manipulators;IEEE Transactions on Industrial Electronics;2024-09

2. Complex behavior from intrinsic motivation to occupy future action-state path space;Nature Communications;2024-07-29

3. Model Predictive Control for Dynamic Cloth Manipulation: Parameter Learning and Experimental Validation;IEEE Transactions on Control Systems Technology;2024-07

4. Leaky PPO: A Simple and Efficient RL Algorithm for Autonomous Vehicles;2024 International Joint Conference on Neural Networks (IJCNN);2024-06-30

5. Reinforcement learning for decision-making under deep uncertainty;Journal of Environmental Management;2024-05