Efficient Probabilistic Performance Bounds for Inverse Reinforcement Learning

Published: 2018-04-29 Issue: 1 Volume: 32
ISSN: 2374-3468
Container-title: Proceedings of the AAAI Conference on Artificial Intelligence
Short-container-title: AAAI

Author:

Daniel Brown, Scott Niekum

Abstract

In the field of reinforcement learning there has been recent progress towards safety and high-confidence bounds on policy performance. However, to our knowledge, no practical methods exist for determining high-confidence policy performance bounds in the inverse reinforcement learning setting, where the true reward function is unknown and only samples of expert behavior are given. We propose a sampling method based on Bayesian inverse reinforcement learning that uses demonstrations to determine practical high-confidence upper bounds on the α-worst-case difference in expected return between any evaluation policy and the optimal policy under the expert's unknown reward function. We evaluate our proposed bound on both a standard grid navigation task and a simulated driving task and achieve tighter and more accurate bounds than a feature count-based baseline. We also give examples of how our proposed bound can be utilized to perform risk-aware policy selection and risk-aware policy improvement. Because our proposed bound requires several orders of magnitude fewer demonstrations than existing high-confidence bounds, it is the first practical method that allows agents that learn from demonstration to express confidence in the quality of their learned policy.
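
The core quantity in the abstract can be illustrated with a rough sketch. It is not the authors' implementation: it assumes a small tabular MDP with state-based rewards, and it assumes a list of reward vectors already sampled from a Bayesian IRL posterior (the sampler itself is not shown). For each sampled reward it computes the gap in expected return between the optimal policy and the evaluation policy, then takes the empirical α-quantile of those gaps. The additional confidence adjustment that would make this a high-confidence bound is omitted. All function and parameter names below are hypothetical.

```python
import math

def value_iteration(P, r, gamma, policy=None, iters=500):
    """Iteratively compute state values for a state-based reward vector r.

    P[a][s][t] is the probability of moving from state s to t under action a.
    With policy=None the max over actions is taken (optimal values);
    otherwise policy[s] fixes the action in state s (policy evaluation).
    """
    n = len(r)
    v = [0.0] * n
    for _ in range(iters):
        new_v = []
        for s in range(n):
            actions = range(len(P)) if policy is None else [policy[s]]
            best = max(sum(P[a][s][t] * v[t] for t in range(n)) for a in actions)
            new_v.append(r[s] + gamma * best)
        v = new_v
    return v

def policy_losses(P, reward_samples, policy, start_dist, gamma=0.9):
    """Expected-return gap (optimal minus evaluated policy) per sampled reward."""
    losses = []
    for r in reward_samples:
        v_opt = value_iteration(P, r, gamma)
        v_pi = value_iteration(P, r, gamma, policy)
        gap = sum(p * (vo - vp) for p, vo, vp in zip(start_dist, v_opt, v_pi))
        losses.append(gap)
    return losses

def alpha_worst_case(losses, alpha=0.95):
    """Empirical alpha-quantile of the sampled losses (no confidence correction)."""
    ordered = sorted(losses)
    k = max(0, math.ceil(alpha * len(ordered)) - 1)
    return ordered[k]

# Toy example: two states, action 0 stays put, action 1 switches state.
P = [[[1.0, 0.0], [0.0, 1.0]],
     [[0.0, 1.0], [1.0, 0.0]]]
reward_samples = [[0.0, 1.0], [1.0, 0.0], [1.0, 0.0], [1.0, 0.0]]  # stand-in for posterior samples
losses = policy_losses(P, reward_samples, policy=[0, 0], start_dist=[1.0, 0.0])
bound = alpha_worst_case(losses, alpha=0.95)
```

In this toy case the "always stay" policy is optimal under three of the four sampled rewards, so the loss is large only under the remaining sample. A higher α pushes the estimate toward that worst sample.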

Publisher

Association for the Advancement of Artificial Intelligence (AAAI)

Subject

General Medicine

Cited by 9 articles.

1. A Survey of Autonomous Vehicle Behaviors: Trajectory Planning Algorithms, Sensed Collision Risks, and User Expectations; Sensors; 2024-07-24

2. Data-Driven Policy Learning Methods from Biological Behavior: A Systematic Review; Applied Sciences; 2024-05-09

3. Autonomous Assessment of Demonstration Sufficiency via Bayesian Inverse Reinforcement Learning; Proceedings of the 2024 ACM/IEEE International Conference on Human-Robot Interaction; 2024-03-11

4. A Comprehensive Review on Deep Learning-Based Motion Planning and End-to-End Learning for Self-Driving Vehicle; IEEE Access; 2024

5. Inverse Reinforcement Learning for Optimal Control Systems; Advances in Industrial Control; 2024