Affiliation:
1. LIONS, EPFL
2. Max Planck Institute for Software Systems (MPI-SWS)
Abstract
We study the problem of inverse reinforcement learning (IRL) with the added twist that the learner is assisted by a helpful teacher. More formally, we tackle the following algorithmic question: How could a teacher provide an informative sequence of demonstrations to an IRL learner to speed up the learning process? We present an interactive teaching framework where a teacher adaptively chooses the next demonstration based on the learner's current policy. In particular, we design teaching algorithms for two concrete settings: an omniscient setting where the teacher has full knowledge about the learner's dynamics, and a blackbox setting where the teacher has minimal knowledge. We then study a sequential variant of the popular MCE-IRL learner and prove convergence guarantees for our teaching algorithm in the omniscient setting. Extensive experiments with a car driving simulator environment show that the learning progress can be sped up drastically compared to an uninformative teacher.
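The loop described in the abstract, where the teacher adaptively picks the next demonstration from the learner's current policy, can be illustrated with a minimal sketch. This is not the paper's algorithm: it assumes a toy linear-reward setting with per-state features, and all names (greedy_policy, demonstrate, w_hat, etc.) are illustrative placeholders, with a simple feature-matching update standing in loosely for the MCE-IRL learner.

```python
# Minimal illustrative sketch of an interactive teaching loop for an IRL learner.
# Assumptions (not from the paper): linear reward over per-state features,
# a degenerate "policy" that just picks the best single state, and a
# feature-matching gradient step as a stand-in for the MCE-IRL update.
import numpy as np

rng = np.random.default_rng(0)

n_states, n_features = 5, 3
features = rng.normal(size=(n_states, n_features))   # per-state feature vectors
w_true = np.array([1.0, -0.5, 0.2])                  # teacher's true reward weights
w_hat = np.zeros(n_features)                         # learner's current estimate

def greedy_policy(w):
    """Stand-in for a policy: pick the state with highest estimated reward."""
    return int(np.argmax(features @ w))

def demonstrate(state):
    """Teacher's demonstration: the feature vector of the demonstrated state."""
    return features[state]

for t in range(20):
    # Omniscient teacher: inspect the learner's current behavior and stop
    # once it already matches the teacher's optimal choice.
    learner_choice = greedy_policy(w_hat)
    teacher_choice = greedy_policy(w_true)
    if learner_choice == teacher_choice:
        break
    # Otherwise, provide a demonstration of the optimal state and let the
    # learner take a feature-matching step toward the demonstrated features.
    mu_teacher = demonstrate(teacher_choice)
    mu_learner = features[learner_choice]
    w_hat += 0.5 * (mu_teacher - mu_learner)

print("learner weights after teaching:", w_hat)
```

The key point of the sketch is the adaptive choice: the teacher looks at the learner's current policy before selecting what to demonstrate next, rather than replaying a fixed, uninformative sequence of demonstrations.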
Publisher
International Joint Conferences on Artificial Intelligence Organization
Cited by
10 articles.
1. Autonomous Assessment of Demonstration Sufficiency via Bayesian Inverse Reinforcement Learning;Proceedings of the 2024 ACM/IEEE International Conference on Human-Robot Interaction;2024-03-11
2. Advancements in Humanoid Robots: A Comprehensive Review and Future Prospects;IEEE/CAA Journal of Automatica Sinica;2024-02
3. Reinforcement Learning in Education: A Literature Review;Informatics;2023-09-18
4. Calibrated Human-Robot Teaching: What People Do When Teaching Norms to Robots*;2023 32nd IEEE International Conference on Robot and Human Interactive Communication (RO-MAN);2023-08-28
5. Norm Learning with Reward Models from Instructive and Evaluative Feedback;2022 31st IEEE International Conference on Robot and Human Interactive Communication (RO-MAN);2022-08-29