Affiliation:
1. Unmanned System Research Institute, Northwestern Polytechnical University, Xi’an 710072, China
2. National Key Laboratory of Unmanned Aerial Vehicle Technology, Northwestern Polytechnical University, Xi’an 710072, China
3. Integrated Research and Development Platform of Unmanned Aerial Vehicle Technology, Northwestern Polytechnical University, Xi’an 710072, China
Abstract
Given the rapid advancement of kinetic pursuit technology, this paper introduces a maneuvering strategy, denoted LSRC-TD3, that integrates line-of-sight (LOS) angle rate correction with deep reinforcement learning (DRL) for high-speed unmanned aerial vehicle (UAV) pursuit–evasion (PE) games, with the aim of evading high-speed, highly dynamic pursuers. In the most challenging game situations, where the UAV is at a disadvantage in both speed and maximum available overload, its maneuvering space is severely compressed and evasion becomes significantly harder, which places higher demands on the strategy and timing of trajectory-changing maneuvers. Taking evasion, trajectory constraints, and energy consumption into account, we formulate the reward function by combining “terminal” and “process” rewards with “strong” and “weak” incentive guidance, which reduces the difficulty of early exploration and accelerates convergence of the game network. Additionally, this paper introduces an LOS angle rate correction factor into the twin delayed deep deterministic policy gradient (TD3) algorithm, which enhances the sensitivity of the high-speed UAV to changes in LOS rate and the accuracy of evasion timing, improving the effectiveness and adaptability of the intelligent maneuvering strategy. Monte Carlo simulation results demonstrate that the proposed method achieves a high level of evasion performance, balancing energy optimization against the miss distance required by high-speed UAVs, and accomplishes efficient evasion in highly challenging PE game scenarios.
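To make the two quantities named in the abstract concrete, the following is a minimal sketch, not the paper's implementation. It assumes a planar engagement, uses the standard kinematic expression for the LOS angle rate, and pairs it with a hypothetical composite reward in which a per-step "process" term penalizes overload usage and a "terminal" term rewards meeting a required miss distance. The function names, weights (`w_terminal`, `w_energy`), and the quadratic energy penalty are illustrative assumptions; the abstract does not specify the actual reward terms or how the correction factor enters TD3.

```python
import math


def los_angle_rate(rel_pos, rel_vel):
    """Planar LOS angle rate in rad/s.

    rel_pos and rel_vel are (x, y) tuples of pursuer-relative position
    and velocity. With LOS angle q = atan2(y, x), differentiation gives
    q_dot = (x * vy - y * vx) / (x^2 + y^2).
    """
    x, y = rel_pos
    vx, vy = rel_vel
    r2 = x * x + y * y
    if r2 == 0.0:
        raise ValueError("relative range is zero; LOS rate is undefined")
    return (x * vy - y * vx) / r2


def shaped_reward(miss_distance, required_miss, overload_cmd, max_overload,
                  done, w_terminal=10.0, w_energy=0.1):
    """Hypothetical composite reward (assumed form, not from the paper).

    Process term: small penalty on normalized overload at every step,
    standing in for energy consumption.
    Terminal term: added only when the episode ends, positive if the
    required miss distance was achieved and negative otherwise.
    """
    process = -w_energy * (overload_cmd / max_overload) ** 2
    if not done:
        return process
    terminal = w_terminal if miss_distance >= required_miss else -w_terminal
    return process + terminal


if __name__ == "__main__":
    # Pursuer 1 km away along the x-axis, closing at 300 m/s with a
    # 50 m/s crossing component.
    q_dot = los_angle_rate((1000.0, 0.0), (-300.0, 50.0))
    print(f"LOS rate: {q_dot:.4f} rad/s")
    print("step reward:", shaped_reward(0.0, 3.0, 2.0, 4.0, done=False))
    print("final reward:", shaped_reward(5.0, 3.0, 2.0, 4.0, done=True))
```

In a sketch like this, the LOS rate would typically be one of the observations fed to the policy, since a growing LOS rate indicates that the pursuer's geometry is changing quickly and that a well-timed maneuver may be most effective.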
Funder
National Natural Science Foundation of China
Fundamental Research Funds