Affiliation:
1. Air Defense and AntiMissile School, Air Force Engineering University, Xi’an 710043, China
2. Unmanned System Research Institute, Northwestern Polytechnical University, Xi’an 710072, China
Abstract
In this paper, we introduce an agent rescue scheduling approach based on proximal policy optimization (PPO), coupled with a singularity-free predefined-time control strategy. The approach aims to improve the efficiency and precision of rescue missions. First, we design an evaluation function closely related to the average flying distance of the agents, which provides a quantitative benchmark for comparing scheduling schemes and helps optimize the allocation of rescue resources. Second, we develop a scheduling strategy optimization method using the PPO algorithm. This method automatically learns and adjusts scheduling strategies to adapt to complex rescue environments and varying task demands. The evaluation function supplies the feedback signal for the PPO algorithm, allowing it to adjust the scheduling strategy toward the best-scoring allocation. Third, to achieve stable and precise navigation of agents to their designated positions, we formulate a singularity-free predefined-time fuzzy adaptive tracking control strategy. This strategy adapts its control parameters in response to external disturbances and uncertainties, ensuring that agents arrive at their destinations within the predefined time. Finally, to validate the proposed approach, we build a simulation environment in Python 3.7 and compare PPO against another optimization method, the deep Q-network (DQN), using the variation in reward values as the evaluation benchmark.
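As a rough illustration of how a distance-based evaluation function could feed PPO, the following minimal sketch may help. It is not the paper's implementation: the abstract does not give the exact form of the evaluation function, so the Euclidean-distance score, the negative-distance reward, the assignment encoding, and all function names here are assumptions. The clipped surrogate term is the standard PPO objective for a single sample.

```python
import math


def average_flying_distance(agents, targets, assignment):
    """Mean Euclidean distance flown by each agent to its assigned target.

    Hypothetical encoding: assignment[i] is the index of the target
    given to agent i; positions are (x, y) tuples.
    """
    total = 0.0
    for i, j in enumerate(assignment):
        ax, ay = agents[i]
        tx, ty = targets[j]
        total += math.hypot(tx - ax, ty - ay)
    return total / len(assignment)


def reward(agents, targets, assignment):
    # Assumed reward shape: a shorter average distance gives a larger reward.
    return -average_flying_distance(agents, targets, assignment)


def ppo_clipped_term(ratio, advantage, eps=0.2):
    """Standard PPO clipped surrogate for one sample:
    min(r * A, clip(r, 1 - eps, 1 + eps) * A)."""
    clipped_ratio = max(1.0 - eps, min(ratio, 1.0 + eps))
    return min(ratio * advantage, clipped_ratio * advantage)


if __name__ == "__main__":
    agents = [(0.0, 0.0), (6.0, 8.0)]
    targets = [(3.0, 4.0), (6.0, 8.0)]
    # Compare two candidate schedules by their evaluation score.
    print(reward(agents, targets, [0, 1]))
    print(reward(agents, targets, [1, 0]))
    # Clipping limits how much a single update can exploit a large ratio.
    print(ppo_clipped_term(1.5, 1.0))
```

In this sketch, the schedule with the shorter average distance receives the higher reward, and the clipping keeps the policy update conservative when the new policy's probability ratio drifts far from 1.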
Funder
National Natural Science Foundation of China
Youth Talent Lifting Project of the China Association for Science and Technology