Temporal logic motion control using actor–critic methods

Published:2015-05-26 Issue:10 Volume:34 Page:1329-1344
ISSN:0278-3649
Container-title:The International Journal of Robotics Research
language:en
Short-container-title:The International Journal of Robotics Research

Author:

Wang Jing¹,Ding Xuchu²,Lahijanian Morteza³,Paschalidis Ioannis Ch.¹,Belta Calin A.¹

Affiliation:

1. Division of Systems Engineering, Department of Mechanical Engineering, and Department of Electrical and Computer Engineering, Boston University, Boston, MA, USA

2. Embedded Systems and Networks Group, United Technologies Research Center, East Hartford, CT, USA

3. Department of Computer Science, Rice University, Houston, TX, USA

Abstract

This paper considers the problem of deploying a robot from a specification given as a temporal logic statement about some properties satisfied by the regions of a large, partitioned environment. We assume that the robot has noisy sensors and actuators and model its motion through the regions of the environment as a Markov decision process (MDP). The robot control problem becomes finding the control policy which maximizes the probability of satisfying the temporal logic task on the MDP. For a large environment, obtaining transition probabilities for each state–action pair, as well as solving the necessary optimization problem for the optimal policy, are computationally intensive. To address these issues, we propose an approximate dynamic programming framework based on a least-squares temporal difference learning method of the actor–critic type. This framework operates on sample paths of the robot and optimizes a randomized control policy with respect to a small set of parameters. The transition probabilities are obtained only when needed. Simulations confirm that convergence of the parameters translates to an approximately optimal policy.
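To make the optimization target in the abstract concrete, the following is a minimal sketch of the exact baseline that the abstract describes as computationally intensive for large environments: value iteration for the maximum probability of reaching a "task satisfied" state in an MDP. It is not the paper's actor–critic method. The toy MDP, its state and action names, and the function `max_reach_probability` are all hypothetical and chosen only for illustration; the sketch also assumes the temporal logic task has already been reduced to reaching a single absorbing goal state.

```python
# Hypothetical toy MDP (not from the paper): state -> action -> [(next_state, prob)].
# "goal" stands for "task satisfied", "trap" for "task can no longer be satisfied".
MDP = {
    "start": {
        "safe": [("mid", 0.9), ("trap", 0.1)],
        "fast": [("goal", 0.5), ("trap", 0.5)],
    },
    "mid": {
        "go": [("goal", 0.8), ("start", 0.2)],
    },
    "goal": {},  # absorbing
    "trap": {},  # absorbing
}


def expected_value(outcomes, value):
    """Expected value of the successor state for one action."""
    return sum(prob * value[nxt] for nxt, prob in outcomes)


def max_reach_probability(mdp, goal, tol=1e-12, max_iters=10_000):
    """Value iteration for the maximum probability of eventually reaching `goal`.

    Returns (value, policy): value[s] is the maximal reachability probability
    from s, and policy[s] is a greedy action attaining it.
    """
    value = {s: (1.0 if s == goal else 0.0) for s in mdp}
    for _ in range(max_iters):
        delta = 0.0
        for state, actions in mdp.items():
            if state == goal or not actions:
                continue  # absorbing states keep their fixed value
            best = max(expected_value(o, value) for o in actions.values())
            delta = max(delta, abs(best - value[state]))
            value[state] = best
        if delta < tol:
            break
    policy = {
        state: max(actions, key=lambda a: expected_value(actions[a], value))
        for state, actions in mdp.items()
        if actions
    }
    return value, policy


value, policy = max_reach_probability(MDP, "goal")
print(policy["start"], round(value["start"], 3))
```

In this toy model the greedy choice at `start` is the action `safe`, with a satisfaction probability of about 0.878, against 0.5 for `fast`. The sketch needs every transition probability up front and sweeps every state, which is the cost the abstract's sample-path-based, parameterized approach is meant to avoid.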

Publisher

SAGE Publications

Subject

Applied Mathematics,Artificial Intelligence,Electrical and Electronic Engineering,Mechanical Engineering,Modelling and Simulation,Software

Link

http://journals.sagepub.com/doi/pdf/10.1177/0278364915581505

Reference: 34 articles.

1. Controller Synthesis for Probabilistic Systems (Extended Abstract)

2. Baier C, Katoen JP (2008) Principles of Model Checking. Cambridge, MA: MIT Press, pp. 2620–2649.

3. Neuronlike adaptive elements that can solve difficult learning control problems

Cited by 14 articles.

1. Hierarchical Motion Planning Under Probabilistic Temporal Tasks and Safe-Return Constraints;IEEE Transactions on Automatic Control;2023-11

2. Collaborative Rover-copter Path Planning and Exploration with Temporal Logic Specifications Based on Bayesian Update Under Uncertain Environments;ACM Transactions on Cyber-Physical Systems;2022-04-11

3. Actor-Critic Traction Control Based on Reinforcement Learning with Open-Loop Training;Modelling and Simulation in Engineering;2021-12-07

4. Modular Deep Reinforcement Learning for Continuous Motion Planning With Temporal Logic;IEEE Robotics and Automation Letters;2021-10

5. Reinforcement Learning Based Temporal Logic Control with Maximum Probabilistic Satisfaction;2021 IEEE International Conference on Robotics and Automation (ICRA);2021-05-30