Hierarchical Maneuver Decision Method Based on PG-Option for UAV Pursuit-Evasion Game
Author:
Li Bo 1, Zhang Haohui 1, He Pingkuan 1, Wang Geng 1, Yue Kaiqiang 1, Neretin Evgeny 2
Affiliation:
1. School of Electronics and Information, Northwestern Polytechnical University, Xi’an 710072, China
2. School of Robotic and Intelligent Systems, Moscow Aviation Institute, 125993 Moscow, Russia
Abstract
To address autonomous decision-making in an unmanned aerial vehicle (UAV) pursuit-evasion game, this paper proposes a hierarchical maneuver decision method based on the PG-option algorithm. First, after considering the range of relative situations between the two sides, the paper designs four maneuver decision options: advantage game, quick escape, situation change, and quick pursuit. Each option is trained with Soft Actor-Critic (SAC) to obtain the corresponding meta-policy. Second, to avoid a high-dimensional state space in the hierarchical model, the paper combines the policy gradient (PG) algorithm with the traditional option-based hierarchical reinforcement learning algorithm, using PG to train the policy selector as the top-level strategy. Finally, to address frequent switching between meta-policies, the paper introduces delayed selection in the policy selector and uses expert experience to design the termination function of the meta-policies, which improves the flexibility of policy switching. Simulation experiments show that the PG-option algorithm performs well in the UAV pursuit-evasion game and adapts to various environments by switching to the meta-policy that suits the current situation.
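The top-level structure described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the linear softmax selector, the REINFORCE-style update, the `min_hold` delay, and the boolean `expert_says_switch` flag are all assumptions chosen for illustration, and the SAC-trained meta-policies are omitted entirely. Only the four option names come from the abstract.

```python
import math
import random

# Option names taken from the abstract; everything else below is hypothetical.
OPTIONS = ["advantage_game", "quick_escape", "situation_change", "quick_pursuit"]


class PolicySelector:
    """Softmax policy over options with linear preferences (an assumed form)."""

    def __init__(self, n_features, n_options=len(OPTIONS), lr=0.1, seed=0):
        self.theta = [[0.0] * n_features for _ in range(n_options)]
        self.lr = lr
        self.rng = random.Random(seed)

    def probs(self, features):
        logits = [sum(w * x for w, x in zip(row, features)) for row in self.theta]
        top = max(logits)  # subtract the max for numerical stability
        exps = [math.exp(v - top) for v in logits]
        total = sum(exps)
        return [e / total for e in exps]

    def sample(self, features):
        p = self.probs(features)
        return self.rng.choices(range(len(p)), weights=p)[0]

    def update(self, features, option, ret):
        # Policy-gradient step: grad log pi(o|s) = (1[o == k] - p_k) * features
        p = self.probs(features)
        for k, row in enumerate(self.theta):
            coeff = (1.0 if k == option else 0.0) - p[k]
            for j, x in enumerate(features):
                row[j] += self.lr * ret * coeff * x


def should_terminate(steps_held, min_hold, expert_says_switch):
    """Assumed termination rule: never switch before min_hold steps have
    elapsed (delayed selection), then defer to an expert-designed condition."""
    if steps_held < min_hold:
        return False
    return expert_says_switch


def step_controller(selector, features, current, steps_held, min_hold,
                    expert_says_switch):
    """Return (option index, steps held) for the next decision step."""
    if current is None or should_terminate(steps_held, min_hold, expert_says_switch):
        return selector.sample(features), 1
    return current, steps_held + 1
```

In this sketch the selector is queried only when no option is active or the termination rule fires, which is one simple way to limit frequent switching between meta-policies; the paper's actual termination function and delay mechanism may differ.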
Funder
National Nature Science Foundation of China Central Universities Technology on Electromagnetic Space Operations and Applications Laboratory Key Research and Development Program of Shaanxi Province key core technology research plan
Subject
Artificial Intelligence, Computer Science Applications, Aerospace Engineering, Information Systems, Control and Systems Engineering
Cited by
4 articles.