Learning preferences for manipulation tasks from online coactive feedback

Published: 2015-05-26 Issue: 10 Volume: 34 Pages: 1296-1313
ISSN: 0278-3649
Container-title: The International Journal of Robotics Research
Language: en
Short-container-title: The International Journal of Robotics Research

Author:

Jain Ashesh¹, Sharma Shikhar¹, Joachims Thorsten¹, Saxena Ashutosh¹

Affiliation:

1. Department of Computer Science, Cornell University, USA

Abstract

We consider the problem of learning preferences over trajectories for mobile manipulators such as personal robots and assembly line robots. The preferences we learn are more intricate than simple geometric constraints on trajectories; rather, they are governed by the surrounding context of various objects and human interactions in the environment. We propose a coactive online learning framework for teaching preferences in contextually rich environments. The key novelty of our approach lies in the type of feedback expected from the user: the human user does not need to demonstrate optimal trajectories as training data, but merely needs to iteratively provide trajectories that slightly improve over the trajectory currently proposed by the system. We argue that this coactive preference feedback can be more easily elicited than demonstrations of optimal trajectories. Nevertheless, theoretical regret bounds of our algorithm match the asymptotic rates of optimal trajectory algorithms. We implement our algorithm on two high-degree-of-freedom robots, PR2 and Baxter, and present three intuitive mechanisms for providing such incremental feedback. In our experimental evaluation we consider two context-rich settings, household chores and grocery store checkout, and show that users are able to train the robot with just a few rounds of feedback (taking only a few minutes).
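The feedback loop described in the abstract can be illustrated with a minimal sketch. It is not taken from the paper: it assumes a linear scoring function over hypothetical trajectory feature vectors, a small fixed set of candidate trajectories, and a simulated user who always returns the candidate that is the smallest true improvement over the current proposal. The names `propose`, `coactive_update`, `simulate`, `candidates`, and `w_true` are all illustrative.

```python
import numpy as np


def propose(w, candidates):
    """Return the index of the candidate trajectory scoring highest under w."""
    return int(np.argmax(candidates @ w))


def coactive_update(w, proposed, improved):
    """Perceptron-style step: shift w toward the features of the improved
    trajectory and away from the features of the proposed one."""
    return w + (improved - proposed)


def simulate(w_true, candidates, rounds=20):
    """Run the propose / improve / update loop with a simulated user.

    The simulated user (an assumption of this sketch) knows w_true and
    answers each proposal with the candidate that is the smallest strict
    improvement under w_true, never the optimal one directly.
    """
    w = np.zeros_like(w_true)
    true_scores = candidates @ w_true
    for _ in range(rounds):
        i = propose(w, candidates)
        better = np.flatnonzero(true_scores > true_scores[i])
        if better.size == 0:
            break  # the proposal is already the user's best trajectory
        j = better[np.argmin(true_scores[better])]
        w = coactive_update(w, candidates[i], candidates[j])
    return w


# Hypothetical 2-D feature vectors for three candidate trajectories.
candidates = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
w_true = np.array([1.0, 2.0])  # hidden user preference (illustrative)

w_learned = simulate(w_true, candidates)
print("learned weights:", w_learned)
print("proposed trajectory index:", propose(w_learned, candidates))
```

The point of the sketch is that the user never supplies the optimal trajectory up front; each round only needs a trajectory that is somewhat better than the one proposed, and the weights are adjusted by the difference in features.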

Publisher

SAGE Publications

Subject

Applied Mathematics,Artificial Intelligence,Electrical and Electronic Engineering,Mechanical Engineering,Modelling and Simulation,Software

Link

http://journals.sagepub.com/doi/pdf/10.1177/0278364915581193

References: 60 articles.

1. Autonomous Helicopter Aerobatics through Apprenticeship Learning

2. Keyframe-based Learning from Demonstration

3. The Stochastic Motion Roadmap: A Sampling Framework for Planning with Markov Motion Uncertainty

4. A survey of robot learning from demonstration

5. A robot path planning framework that learns from experience

Cited by 45 articles.

1. Adaptive Bitrate Algorithms via Deep Reinforcement Learning With Digital Twins Assisted Trajectory;IEEE Transactions on Network Science and Engineering;2024-07

2. Assimilating human feedback from autonomous vehicle interaction in reinforcement learning models;Autonomous Agents and Multi-Agent Systems;2024-06-26

3. A Review of Natural-Language-Instructed Robot Execution Systems;AI;2024-06-26

4. Batch Active Learning of Reward Functions from Human Preferences;ACM Transactions on Human-Robot Interaction;2024-06-14

5. Aligning Human and Robot Representations;Proceedings of the 2024 ACM/IEEE International Conference on Human-Robot Interaction;2024-03-11