Memory-Enhanced Twin Delayed Deep Deterministic Policy Gradient (ME-TD3)-Based Unmanned Combat Aerial Vehicle Trajectory Planning for Avoiding Radar Detection Threats in Dynamic and Unknown Environments

Published: 2023-11-25 Issue: 23 Volume: 15 Page: 5494
ISSN: 2072-4292
Container-title: Remote Sensing
language: en
Short-container-title: Remote Sensing

Author:

Li Jiantao¹, Zhang Tianxian¹, Liu Kai¹

Affiliation:

1. School of Information and Communication Engineering, University of Electronic Science and Technology of China, Chengdu 611731, China

Abstract

Unmanned combat aerial vehicle (UCAV) trajectory planning to avoid radar detection threats is a complicated optimization problem that has been widely studied. The rapid changes in Radar Cross Sections (RCSs), the unknown cruise trajectory of airborne radar, and the uncertain distribution of radars exacerbate the complexity of this problem. In this paper, we propose a novel UCAV trajectory planning method based on deep reinforcement learning (DRL) technology to overcome the adverse impacts caused by the dynamics and randomness of environments. A predictive control model is constructed to describe the dynamic characteristics of the UCAV trajectory planning problem in detail. To improve the UCAV’s predictive ability, we propose a memory-enhanced twin delayed deep deterministic policy gradient (ME-TD3) algorithm that uses an attention mechanism to effectively extract environmental patterns from historical information. The simulation results show that the proposed method can successfully train UCAVs to carry out trajectory planning tasks in dynamic and unknown environments. Furthermore, the ME-TD3 algorithm outperforms other classical DRL algorithms in UCAV trajectory planning, exhibiting superior performance and adaptability.
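The abstract names two ingredients: an attention mechanism that summarizes historical information, and the twin delayed deep deterministic policy gradient (TD3) update. The following is a minimal sketch of each idea in isolation, not the authors' implementation. It assumes plain scaled dot-product attention with no learned projections, and it uses the standard clipped double-Q target from TD3. The function names `attention_summary` and `td3_target`, and the toy observation vectors, are hypothetical and chosen only for illustration.

```python
import math


def attention_summary(history, query):
    """Scaled dot-product attention over past observations (assumed form).

    history: list of equal-length observation vectors (hypothetical data).
    query:   the vector used to decide which past observations matter.
    Returns the attention weights and the weighted-average context vector.
    """
    d = len(query)
    # Similarity of each past observation to the query, scaled by sqrt(d).
    scores = [
        sum(h_i * q_i for h_i, q_i in zip(h, query)) / math.sqrt(d)
        for h in history
    ]
    # Numerically stable softmax over the scores.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    # Context vector: attention-weighted average of the history.
    context = [sum(w * h[j] for w, h in zip(weights, history)) for j in range(d)]
    return weights, context


def td3_target(reward, done, q1_next, q2_next, gamma=0.99):
    """Clipped double-Q target: bootstrap from the smaller critic estimate."""
    return reward + gamma * (1.0 - done) * min(q1_next, q2_next)


# Toy usage with made-up numbers.
history = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
weights, context = attention_summary(history, query=[1.0, 0.0])
target = td3_target(reward=1.0, done=0.0, q1_next=2.0, q2_next=3.0, gamma=0.5)
```

In a full agent, a context vector like this would be fed to the actor and critic networks alongside the current observation. How ME-TD3 actually wires the attention module into TD3 is described in the paper itself and is not reproduced here.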

Funder

National Natural Science Foundation of China

GF Science and Technology Special Innovation Zone Project

Fundamental Research Funds of Central Universities

Publisher

MDPI AG

Subject

General Earth and Planetary Sciences

Link

https://www.mdpi.com/2072-4292/15/23/5494/pdf
