PPO-Exp: Keeping Fixed-Wing UAV Formation with Deep Reinforcement Learning

Published: 2022-12-31 Issue: 1 Volume: 7 Page: 28
ISSN: 2504-446X
Container-title: Drones
Language: en
Short-container-title: Drones

Author:

Xu Dan, Guo Yunxiao, Yu Zhongyi, Wang Zhenfeng, Lan Rongze, Zhao Runhao, Xie Xinjia, Long Han

Abstract

Flocking for fixed-wing unmanned aerial vehicles (UAVs) is an extremely complex challenge because of the control problem of fixed-wing UAVs and the difficulty of coordinating the system. Recently, flocking approaches based on reinforcement learning have attracted attention. However, current methods require each UAV to make its decisions in a decentralized manner, which increases the cost and computation of the whole UAV system. This paper studies a low-cost UAV formation system consisting of one leader (equipped with an intelligence chip) and five followers (without an intelligence chip), and proposes a centralized, collision-free formation-keeping method. Communication throughout the process is taken into account, and the protocol is designed to minimize the communication cost. In addition, an analysis of the Proximal Policy Optimization (PPO) algorithm is provided; the paper derives the estimation error bound and reveals the relationship between this bound and exploration. To encourage the agent to balance exploration against the estimation error bound, a version of PPO named PPO-Exploration (PPO-Exp) is proposed. It can adjust the clip constraint parameter, making the exploration mechanism more flexible. The experimental results show that PPO-Exp performs better than current algorithms on these tasks.
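
The abstract does not say how PPO-Exp adjusts the clip constraint parameter, so the following is only a minimal sketch of the standard PPO clipped surrogate objective with the clip range exposed as an argument. It is not the authors' method. The function name `clipped_surrogate` and the sample ratios and advantages are illustrative assumptions; the point is simply that a wider clip range lets larger policy changes contribute to the objective.

```python
def clipped_surrogate(ratios, advantages, clip_eps):
    """Standard PPO clipped surrogate objective, averaged over samples.

    ratios:     probability ratios pi_new(a|s) / pi_old(a|s), one per sample
    advantages: advantage estimates, one per sample
    clip_eps:   clip range; PPO-Exp is described as adjusting this value,
                but the adjustment rule is NOT given in the abstract
    """
    if len(ratios) != len(advantages):
        raise ValueError("ratios and advantages must have the same length")
    if not ratios:
        raise ValueError("at least one sample is required")

    total = 0.0
    for ratio, adv in zip(ratios, advantages):
        # Restrict the ratio to the interval [1 - clip_eps, 1 + clip_eps].
        clipped_ratio = max(1.0 - clip_eps, min(1.0 + clip_eps, ratio))
        # Take the pessimistic (smaller) of the unclipped and clipped terms.
        total += min(ratio * adv, clipped_ratio * adv)
    return total / len(ratios)


# Hypothetical sample data, chosen only to show the effect of the clip range.
sample_ratios = [1.5, 0.5]
sample_advantages = [1.0, -1.0]

narrow = clipped_surrogate(sample_ratios, sample_advantages, clip_eps=0.2)
wide = clipped_surrogate(sample_ratios, sample_advantages, clip_eps=0.4)
print(narrow, wide)
```

With these hypothetical samples, the first ratio is cut back to the upper clip limit, so widening the range raises the objective. How PPO-Exp chooses the range at each step is described in the paper itself, not here.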

Funder

Dan Xu

Publisher

MDPI AG

Subject

Artificial Intelligence, Computer Science Applications, Aerospace Engineering, Information Systems, Control and Systems Engineering

Link

https://www.mdpi.com/2504-446X/7/1/28/pdf