Affiliation:
1. Centre de Recherche en Automatique de Nancy (CRAN) UMR 7039, CNRS, Faculté des Sciences et Technologies, Université de Lorraine, Vandoeuvre Cedex, France
Abstract
Adaptive dynamic programming (ADP)–based approaches are effective for solving the nonlinear Hamilton–Jacobi–Bellman (HJB) equation in an approximate sense. This paper develops a novel ADP-based approach whose focus is on minimizing the consecutive changes in control inputs over a finite horizon, in order to solve the optimal tracking problem for completely unknown discrete-time systems. To that end, the cost function accounts for tracking performance, energy consumption and, as a novelty, consecutive changes in the control inputs. Through a suitable system transformation, the optimal tracking problem is converted into a regulation problem with respect to the state tracking error. This yields a novel finite-horizon performance index function and a corresponding nonlinear HJB equation, which is solved approximately and iteratively using a novel iterative ADP-based algorithm. A suitable neural-network-based structure is proposed to learn the initial admissible one-step zero control law. The proposed iterative ADP is implemented using the heuristic dynamic programming technique based on an actor–critic neural network structure. Finally, simulation studies are presented to illustrate the effectiveness of the proposed algorithm.
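The cost structure described above can be sketched as follows. This is an illustrative finite-horizon quadratic cost, not the paper's exact formulation: it penalizes the state tracking error, the control effort, and the consecutive change in control inputs. The weight matrices `Q`, `R`, `S` and the function name are hypothetical placeholders.

```python
import numpy as np

def finite_horizon_cost(errors, controls, Q, R, S, u_prev=None):
    """Illustrative cost: sum over k of
    e_k^T Q e_k + u_k^T R u_k + (u_k - u_{k-1})^T S (u_k - u_{k-1}),
    where e_k is the tracking error and (u_k - u_{k-1}) the control change."""
    J = 0.0
    # Assume zero initial control if no previous input is supplied.
    prev = np.zeros_like(controls[0]) if u_prev is None else u_prev
    for e, u in zip(errors, controls):
        du = u - prev            # consecutive change in the control input
        J += e @ Q @ e + u @ R @ u + du @ S @ du
        prev = u
    return J
```

Penalizing `du` smooths the control sequence, which is the feature the abstract highlights as distinguishing this cost from the standard tracking-plus-energy formulation.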
Subject
Applied Mathematics, Control and Optimization, Software, Control and Systems Engineering
Cited by
1 article.
1. Off‐policy model‐based end‐to‐end safe reinforcement learning;International Journal of Robust and Nonlinear Control;2023-11-20