Affiliation:
1. College of Mechanical and Electrical Engineering, China Jiliang University, Hangzhou, China
2. Zhejiang Province Key Laboratory of On‐line Testing Equipment Calibration Technology Research, China Jiliang University, Hangzhou, China
Abstract
This paper introduces mix‐zero‐sum differential (MZSD) game theory to address multi‐player tracking systems, offering a better understanding of the coexistence of cooperation and competition among players. Within this framework, we present an optimal safety tracking control (OSTC) method, which incorporates a control barrier function (CBF) into the value function to ensure that the tracking error remains within a specified range, thus guaranteeing safety while achieving optimization. To eliminate the need for knowledge of the system dynamics, we propose a novel approach that uses off‐policy integral reinforcement learning (IRL) to obtain the Nash equilibrium solution of the MZSD games. We establish a critics–actors neural network (NN) structure in which the networks are updated concurrently. Furthermore, we analyze stability and convergence using the Lyapunov method. Two simulations demonstrate the effectiveness of the proposed algorithm.
Funder
National Natural Science Foundation of China
Natural Science Foundation of Zhejiang Province