Policy Learning with Constraints in Model-free Reinforcement Learning: A Survey

Published: 2021-08
Container-title: Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence

Author:

Liu Yongshuai¹; Halev Avishai¹,²; Liu Xin¹

Affiliation:

1. University of California, Davis

2. Total E&P R&T USA

Abstract

Reinforcement Learning (RL) algorithms have had tremendous success in simulated domains. These algorithms, however, often cannot be directly applied to physical systems, especially in cases where there are constraints to satisfy (e.g. to ensure safety or limit resource consumption). In standard RL, the agent is incentivized to explore any policy with the sole goal of maximizing reward; in the real world, however, ensuring satisfaction of certain constraints in the process is also necessary and essential. In this article, we overview existing approaches addressing constraints in model-free reinforcement learning. We model the problem of learning with constraints as a Constrained Markov Decision Process and consider two main types of constraints: cumulative and instantaneous. We summarize existing approaches and discuss their pros and cons. To evaluate policy performance under constraints, we introduce a set of standard benchmarks and metrics. We also summarize limitations of current methods and present open questions for future research.
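
The distinction between cumulative and instantaneous constraints mentioned in the abstract can be illustrated with a minimal sketch. It is not taken from the survey: the function names, the discount factor, the budget, the threshold, and the cost values are all hypothetical and chosen only for illustration. It also checks a single trajectory, whereas a Constrained Markov Decision Process typically constrains the expected discounted cost over trajectories.

```python
from typing import Sequence


def discounted_sum(values: Sequence[float], gamma: float) -> float:
    """Return sum_t gamma**t * values[t] for one trajectory."""
    total = 0.0
    for t, v in enumerate(values):
        total += (gamma ** t) * v
    return total


def satisfies_cumulative(costs: Sequence[float], gamma: float, budget: float) -> bool:
    """Cumulative constraint: the discounted total cost stays within the budget."""
    return discounted_sum(costs, gamma) <= budget


def satisfies_instantaneous(costs: Sequence[float], threshold: float) -> bool:
    """Instantaneous constraint: every single step's cost stays within the threshold."""
    return all(c <= threshold for c in costs)


# Hypothetical per-step costs from one rollout (illustrative numbers only).
costs = [0.0, 1.0, 0.0, 2.0]
print(satisfies_cumulative(costs, gamma=0.5, budget=1.0))
print(satisfies_instantaneous(costs, threshold=1.0))
```

With these illustrative numbers, the same trajectory stays within the cumulative budget while one step exceeds the per-step threshold, so the two constraint types can disagree on the same behavior.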

Publisher

International Joint Conferences on Artificial Intelligence Organization

Cited by 35 articles.

1. UNIFY: A unified policy designing framework for solving integrated Constrained Optimization and Machine Learning problems; Knowledge-Based Systems; 2024-11

2. Q-Sorting: An Algorithm for Reinforcement Learning Problems with Multiple Cumulative Constraints; Mathematics; 2024-06-28

3. Learning Agents in Robot Navigation: Trends and Next Challenges; Journal of Robotics and Mechatronics; 2024-06-20

4. Assessment of reinforcement learning algorithms for nuclear power plant fuel optimization; Applied Intelligence; 2024-01

5. Learning safe control for multi-robot systems: Methods, verification, and open challenges; Annual Reviews in Control; 2024