Data-driven planning via imitation learning

Published: 2018-07-12 Issue: 13-14 Volume: 37 Page: 1632-1672
ISSN: 0278-3649
Container-title: The International Journal of Robotics Research
Language: en
Short-container-title: The International Journal of Robotics Research

Author:

Choudhury Sanjiban¹, Bhardwaj Mohak¹, Arora Sankalp¹, Kapoor Ashish², Ranade Gireeja², Scherer Sebastian¹, Dey Debadeepta²

Affiliation:

1. Carnegie Mellon University, Pittsburgh, PA, USA

2. Microsoft Research, Redmond, WA, USA

Abstract

Robot planning is the process of selecting a sequence of actions that optimize for a task-specific objective. For instance, the objective for a navigation task would be to find collision-free paths, whereas the objective for an exploration task would be to map unknown areas. The optimal solutions to such tasks are heavily influenced by the implicit structure in the environment, i.e. the configuration of objects in the world. State-of-the-art planning approaches, however, do not exploit this structure, thereby expending valuable effort searching the action space instead of focusing on potentially good actions. In this paper, we address the problem of enabling planners to adapt their search strategies by inferring such good actions in an efficient manner using only the information uncovered by the search up until that time. We formulate this as a problem of sequential decision making under uncertainty where at a given iteration a planning policy must map the state of the search to a planning action. Unfortunately, the training process for such partial-information-based policies is slow to converge and susceptible to poor local minima. Our key insight is that if we could fully observe the underlying world map, we would easily be able to disambiguate between good and bad actions. We hence present a novel data-driven imitation learning framework to efficiently train planning policies by imitating a clairvoyant oracle: an oracle that at train time has full knowledge about the world map and can compute optimal decisions. We leverage the fact that for planning problems, such oracles can be efficiently computed and derive performance guarantees for the learnt policy. We examine two important domains that rely on partial-information-based policies: informative path planning and search-based motion planning.
We validate the approach on a spectrum of environments for both problem domains, including experiments on a real UAV, and show that the learnt policy consistently outperforms state-of-the-art algorithms. Our framework is able to train policies that achieve up to [Formula: see text] more reward than state-of-the-art information-gathering heuristics and a [Formula: see text] speedup as compared with A* on search-based planning problems. Our approach paves the way forward for applying data-driven techniques to other such problem domains under the umbrella of robot planning.
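
The idea of imitating a clairvoyant oracle can be illustrated with a toy sketch. This is not the paper's algorithm or code. It is a minimal DAgger-style loop under assumed simplifications: the world is a short list of cell values, the policy sees only the cells sensed so far, and the oracle sees the whole list. All names (`features`, `oracle_action`, `TablePolicy`, `rollout`, `train`) are hypothetical and introduced only for illustration.

```python
from collections import Counter, defaultdict


def features(observed):
    """Search state visible to the policy: only the cells sensed so far."""
    return tuple(sorted(observed.items()))


def oracle_action(world, observed):
    """Clairvoyant oracle (assumed greedy): reads the full world and picks the
    unsensed cell with the highest value, lowest index on ties."""
    unsensed = [c for c in range(len(world)) if c not in observed]
    return max(unsensed, key=lambda c: (world[c], -c))


class TablePolicy:
    """Toy learner: majority vote over oracle labels, keyed by partial state."""

    def __init__(self):
        self.votes = defaultdict(Counter)

    def update(self, feat, action):
        self.votes[feat][action] += 1

    def act(self, observed, n_cells):
        unsensed = [c for c in range(n_cells) if c not in observed]
        votes = self.votes.get(features(observed))
        if votes:
            ranked = [a for a in votes if a in unsensed]
            if ranked:
                # Most-voted action, lowest index on ties.
                return max(ranked, key=lambda a: (votes[a], -a))
        return unsensed[0]  # fallback for states never labelled


def rollout(world, policy, budget):
    """Run the policy for `budget` sensing steps; return the total value sensed."""
    observed = {}
    for _ in range(budget):
        cell = policy.act(observed, len(world))
        observed[cell] = world[cell]
    return sum(observed.values())


def train(worlds, budget, iterations):
    """Aggregate oracle labels on visited states. Iteration 0 rolls in with the
    oracle, later iterations roll in with the current learner."""
    policy = TablePolicy()
    for it in range(iterations):
        new_data = []
        for world in worlds:
            observed = {}
            for _ in range(budget):
                expert = oracle_action(world, observed)
                new_data.append((features(observed), expert))
                cell = expert if it == 0 else policy.act(observed, len(world))
                observed[cell] = world[cell]
        # Fit only after the iteration, so the roll-in policy stays fixed.
        for feat, action in new_data:
            policy.update(feat, action)
    return policy


if __name__ == "__main__":
    world_a, world_b = [1, 0, 3, 0], [1, 3, 0, 0]
    for n_iter in (1, 2):
        learnt = train([world_a, world_b], budget=2, iterations=n_iter)
        print(n_iter, rollout(world_a, learnt, 2), rollout(world_b, learnt, 2))
```

In this toy setup the two worlds are indistinguishable before the first sensing action. After one iteration the learner has only seen labels on states the oracle visits, so in the first world it falls back to a default action and collects a value of 1. After a second iteration, which labels the states the learner itself reaches, it collects 3. The oracle, which knows the map, collects 4.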

Funder

Office of Naval Research

National Aeronautics and Space Administration

Publisher

SAGE Publications

Subject

Applied Mathematics, Artificial Intelligence, Electrical and Electronic Engineering, Mechanical Engineering, Modelling and Simulation, Software

Link

http://journals.sagepub.com/doi/pdf/10.1177/0278364918781001

References: 132 articles.

1. Apprenticeship learning via inverse reinforcement learning

2. Multi-Heuristic A*

3. Learning heuristic functions for large state spaces

Cited by 32 articles.

1. Transformer-Enhanced Motion Planner: Attention-Guided Sampling for State-Specific Decision Making;IEEE Robotics and Automation Letters;2024-10

2. Learning-based methods for adaptive informative path planning;Robotics and Autonomous Systems;2024-09

3. Manipulating Neural Path Planners via Slight Perturbations;IEEE Robotics and Automation Letters;2024-06

4. DiPPeR: Diffusion-based 2D Path Planner applied on Legged Robots;2024 IEEE International Conference on Robotics and Automation (ICRA);2024-05-13

5. Robotic Learning for Informative Path Planning;2024