Abstract
In this paper, a novel deep reinforcement learning (DRL) method, and robust deep deterministic policy gradient (Robust-DDPG), is proposed for developing a controller that allows robust flying of an unmanned aerial vehicle (UAV) in dynamic uncertain environments. This technique is applicable in many fields, such as penetration and remote surveillance. The learning-based controller is constructed with an actor-critic framework, and can perform a dual-channel continuous control (roll and speed) of the UAV. To overcome the fragility and volatility of original DDPG, three critical learning tricks are introduced in Robust-DDPG: (1) Delayed-learning trick, providing stable learnings, while facing dynamic environments; (2) adversarial attack trick, improving policy’s adaptability to uncertain environments; (3) mixed exploration trick, enabling faster convergence of the model. The training experiments show great improvement in its convergence speed, convergence effect, and stability. The exploiting experiments demonstrate high efficiency in providing the UAV a shorter and smoother path. While, the generalization experiments verify its better adaptability to complicated, dynamic and uncertain environments, comparing to Deep Q Network (DQN) and DDPG algorithms.
Funder
National Natural Science Foundation of China
Aeronautical Science Foundation of China
Subject
General Earth and Planetary Sciences
Cited by
68 articles.
订阅此论文施引文献
订阅此论文施引文献,注册后可以免费订阅5篇论文的施引文献,订阅后可以查看论文全部施引文献