Quadratic Tracking Control of Linear Stochastic Systems with Unknown Dynamics Using Average Off-Policy Q-Learning Method

Published: 2024-05-14 Issue: 10 Volume: 12 Page: 1533
ISSN: 2227-7390
Container-title: Mathematics
Language: en
Short-container-title: Mathematics

Author:

Hao Longyan¹, Wang Chaoli¹, Shi Yibo¹

Affiliation:

1. Department of Control Science and Engineering, University of Shanghai for Science and Technology, Shanghai 200093, China

Abstract

This article investigates the data-based optimal tracking control problem for stochastic discrete-time linear systems. An average off-policy Q-learning algorithm is proposed to solve the optimal control problem in the presence of random disturbances. Compared with existing off-policy reinforcement learning (RL) algorithms, the proposed average off-policy Q-learning algorithm does not assume that an initial stabilizing control is given. First, a pole placement strategy is used to design an initial stabilizing control for systems with unknown dynamics. Second, this initial stabilizing control is used to design a data-based average off-policy Q-learning algorithm. Then, the algorithm is used to solve the stochastic linear quadratic tracking (LQT) problem, and a convergence proof is provided. Finally, numerical examples show that the algorithm outperforms other algorithms in simulation.
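
As a rough illustration of the off-policy Q-learning idea mentioned in the abstract, the following is a minimal sketch and not the authors' algorithm. It is a deliberate simplification: deterministic regulation instead of stochastic tracking, no averaging step, and no pole placement. The system matrices, the weights `Qc` and `Rc`, the helper names, and the choice of a zero initial gain (valid only because the example system is assumed to be open-loop stable) are all hypothetical. The learner uses only recorded transitions, never the matrices `A` and `B`, which appear solely to simulate data.

```python
import numpy as np

def quad_features(z):
    """Features phi(z) such that phi(z) @ theta == z' H z for symmetric H."""
    outer = np.outer(z, z)
    iu = np.triu_indices(len(z))
    scale = np.where(iu[0] == iu[1], 1.0, 2.0)  # off-diagonal terms appear twice
    return outer[iu] * scale

def theta_to_H(theta, dim):
    """Rebuild the symmetric matrix H from its upper-triangular entries."""
    H = np.zeros((dim, dim))
    H[np.triu_indices(dim)] = theta
    return H + H.T - np.diag(np.diag(H))

def off_policy_q_learning(X, U, Xn, Qc, Rc, K0, iters=20):
    """Policy iteration on the Q-function using one fixed batch of data.

    X, U, Xn hold states, inputs, and next states collected under an
    arbitrary behavior policy. K0 must be stabilizing (an assumption here).
    """
    n, m = X.shape[1], U.shape[1]
    K = K0.copy()
    for _ in range(iters):
        rows, costs = [], []
        for x, u, xn in zip(X, U, Xn):
            z = np.concatenate([x, u])
            zn = np.concatenate([xn, -K @ xn])  # next action from target policy
            # Bellman equation: z' H z - zn' H zn = stage cost
            rows.append(quad_features(z) - quad_features(zn))
            costs.append(x @ Qc @ x + u @ Rc @ u)
        theta, *_ = np.linalg.lstsq(np.array(rows), np.array(costs), rcond=None)
        H = theta_to_H(theta, n + m)
        K = np.linalg.solve(H[n:, n:], H[n:, :n])  # policy improvement
    return K

# Hypothetical example system, used only to generate transitions.
rng = np.random.default_rng(0)
A = np.array([[0.9, 0.1], [0.0, 0.8]])
B = np.array([[0.0], [1.0]])
Qc, Rc = np.eye(2), np.eye(1)
X = rng.standard_normal((200, 2))
U = rng.standard_normal((200, 1))
Xn = X @ A.T + U @ B.T
K_learned = off_policy_q_learning(X, U, Xn, Qc, Rc, K0=np.zeros((1, 2)))
print(K_learned)
```

In this sketch the same batch of transitions is reused at every iteration, which is what makes the scheme off-policy: the data come from random inputs, while the gain being evaluated changes from one iteration to the next.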

Funder

National Natural Science Foundation of China under grant

Publisher

MDPI AG

Link

https://www.mdpi.com/2227-7390/12/10/1533/pdf

Reference: 41 articles.

1. Output feedback Q-learning control for the discrete-time linear quadratic regulator problem; Rizvi; IEEE Trans. Neural Netw. Learn. Syst., 2019

2. An iterative technique for the computation of the steady state gains for the discrete optimal regulator; Hewer; IEEE Trans. Autom. Control, 1971

3. Linear quadratic tracking control of unknown discrete-time systems using value iteration algorithm; Li; Neurocomputing, 2018

4. Computational adaptive optimal control for continuous-time linear systems with completely unknown dynamics; Jiang; Automatica, 2012

5. Optimal output-feedback control of unknown continuous-time linear systems using off-policy reinforcement learning; Modares; IEEE Trans. Cybern., 2016