Affiliation:
1. School of Mathematics and Statistics, Qilu University of Technology (Shandong Academy of Sciences), Jinan, China
2. College of Sciences, Shandong University of Aeronautics, Binzhou, China
Abstract
This work addresses the optimized tracking control problem for canonical nonlinear systems with unknown dynamics by combining reinforcement learning (RL) with the backstepping technique. Because such systems contain multiple state variables related through differentiation, the backstepping technique is applied to construct a sequence of virtual controls based on Lyapunov functions. In the final backstepping step, the optimized actual control is derived by performing RL under an identifier‐critic‐actor structure, where RL overcomes the difficulty of solving the Hamilton‐Jacobi‐Bellman (HJB) equation. Unlike traditional RL optimization methods, which derive the RL updating laws from the square of the HJB equation's approximation error, the proposed control derives the RL training laws from the negative gradient of a simple positive definite function that is equivalent to the HJB equation. As a result, the proposed optimized control markedly reduces algorithmic complexity and removes the requirement of known system dynamics. Finally, theoretical analysis and simulation demonstrate the feasibility of the proposed optimized control.
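To make the stated contrast concrete, the following schematic equations sketch the two update-law constructions under generic notation; the symbols (critic weights $\hat{W}$, learning rate $\gamma$, residual $e$, positive definite function $P$) are illustrative and not the paper's exact ones:

% Traditional RL/ADP: gradient descent on the squared HJB approximation error,
% where e is the HJB residual evaluated with the current critic weights.
\dot{\hat{W}}_c = -\,\gamma\,\frac{\partial}{\partial \hat{W}_c}\!\left(\tfrac{1}{2}\,e^{2}\right)

% This work (as described in the abstract): descend a simple positive definite
% function P whose minimizer is equivalent to the HJB solution, avoiding the
% more complex squared-residual gradient.
\dot{\hat{W}} = -\,\gamma\,\frac{\partial P(\hat{W})}{\partial \hat{W}}, \qquad P(\hat{W}) > 0

Because $\partial P/\partial \hat{W}$ is structurally simpler than the gradient of a squared residual, the training laws require fewer terms to compute, which is the source of the complexity reduction claimed above.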
Funder
Natural Science Foundation of Shandong Province
National Natural Science Foundation of China