UAV Path Planning Based on Multicritic-Delayed Deep Deterministic Policy Gradient

Published:2022-03-14 Issue: Volume:2022 Page:1-12
ISSN:1530-8677
Container-title:Wireless Communications and Mobile Computing
language:en
Short-container-title:Wireless Communications and Mobile Computing

Author:

Wu Runjia¹, Gu Fangqing¹, Liu Hai-lin¹, Shi Hongjian²

Affiliation:

1. School of Mathematics and Statistics, Guangdong University of Technology, Guangzhou, China

2. Beijing Normal University-Hong Kong Baptist University United International College, Zhuhai, China

Abstract

The deep deterministic policy gradient (DDPG) algorithm is a reinforcement learning method that has been widely used in UAV path planning. However, the critic network of DDPG is updated frequently during training, which leads to an inevitable overestimation problem and increases the computational complexity of training. This paper therefore presents a multicritic-delayed DDPG method for UAV path planning. It uses multiple critic networks and delayed learning to reduce the overestimation problem of DDPG, and adds noise to improve robustness in real environments. Moreover, a UAV mission platform is built to train the proposed method and to evaluate its effectiveness and robustness. Simulation results show that the proposed algorithm converges faster, converges to a better result, and is more stable. This indicates that the UAV can learn more knowledge from the complex environment.
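The three ingredients named in the abstract (several critics, delayed updates, added noise) can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract does not say how the critics are combined, which update is delayed, or where noise is injected. Averaging the critics, delaying the actor update, and adding Gaussian noise to the action are all assumptions made here for illustration, as are the function names `multicritic_target`, `should_update_actor`, and `noisy_action` and every parameter value.

```python
import numpy as np


def multicritic_target(rewards, dones, next_q, gamma=0.99):
    """Build a TD target from K target-critic estimates.

    next_q has shape (K, batch): one row of next-state Q-values per critic.
    ASSUMPTION: the critics are combined by a plain average; the abstract
    does not specify the combination rule.
    """
    next_q = np.asarray(next_q, dtype=float)
    rewards = np.asarray(rewards, dtype=float)
    dones = np.asarray(dones, dtype=float)
    combined = next_q.mean(axis=0)  # averaging damps any single critic's overestimate
    return rewards + gamma * (1.0 - dones) * combined


def should_update_actor(step, policy_delay=2):
    """ASSUMPTION: the actor is updated once every `policy_delay` critic updates."""
    return step % policy_delay == 0


def noisy_action(action, sigma, low, high, rng):
    """ASSUMPTION: Gaussian noise is added to the action, then clipped to bounds."""
    action = np.asarray(action, dtype=float)
    noise = rng.normal(0.0, sigma, size=action.shape)
    return np.clip(action + noise, low, high)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Two critics, batch of two transitions; the second transition is terminal.
    y = multicritic_target([1.0, 0.0], [0.0, 1.0],
                           [[2.0, 5.0], [4.0, 7.0]], gamma=0.5)
    print("TD targets:", y)
    print("actor update steps:", [s for s in range(6) if should_update_actor(s)])
    print("noisy action:", noisy_action([0.2, -0.4], 0.1, -1.0, 1.0, rng))
```

In a full training loop, each critic would be regressed toward the shared target, and the actor and target networks would be refreshed only on the steps where `should_update_actor` returns true.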

Funder

Programme of Science and Technology of Guangdong Province

Publisher

Hindawi Limited

Subject

Electrical and Electronic Engineering,Computer Networks and Communications,Information Systems

Link

http://downloads.hindawi.com/journals/wcmc/2022/9017079.pdf

Reference 43 articles.

1. Toward a Fully Autonomous UAV: Research Platform for Indoor and Outdoor Urban Search and Rescue

2. UAV Air Combat Autonomous Maneuver Decision Based on DDPG Algorithm

3. An Active Disturbance Rejection Approach to Leader-Follower Controlled Formation

4. Small unmanned aerial vehicle (UAV) real-time intelligence, surveillance and reconnaissance (ISR) using onboard pre-processing;R. Stevens;Proceedings of SPIE,2008

5. Deep Reinforcement Learning-Based Content Placement and Trajectory Design in Urban Cache-Enabled UAV Networks

Cited by 3 articles.

1. Intelligent path planning of mobile robot based on Deep Deterministic Policy Gradient;2022-12-06

2. Multi-agent reinforcement learning based 5G bi-level multi-slice resource allocation;2022 18th International Conference on Computational Intelligence and Security (CIS);2022-12

3. Network Architecture for Optimizing Deep Deterministic Policy Gradient Algorithms;Computational Intelligence and Neuroscience;2022-11-18