Sequence-to-Sequence Multi-Agent Reinforcement Learning for Multi-UAV Task Planning in 3D Dynamic Environment

Published:2022-11-28 Issue:23 Volume:12 Page:12181
ISSN:2076-3417
Container-title:Applied Sciences
language:en
Short-container-title:Applied Sciences

Author:

Liu Ziwei, Qiu Changzhen, Zhang Zhiyong

Abstract

Task planning for multiple unmanned aerial vehicles (UAVs) is one of the main research topics in cooperative UAV control systems. It is a complex optimization problem in which task allocation and path planning are handled separately. However, recalculating optimal results requires so much computation that it is too slow for real-time operation in dynamic environments, and traditional algorithms have difficulty handling scenarios of varying scales. Moreover, the traditional approach confines task planning to a 2D environment, which deviates from the real world. In this paper, we design a 3D dynamic environment and propose a task-planning method based on the sequence-to-sequence multi-agent deep deterministic policy gradient (SMADDPG) algorithm. First, we formulate the task-planning problem as a multi-agent system based on the Markov decision process. Then, DDPG is combined with a sequence-to-sequence structure to learn the system, solving task assignment and path planning simultaneously according to the corresponding reward function. We compare our approach with the traditional reinforcement learning algorithm in this system. The simulation results show that our approach satisfies the task-planning requirements and accomplishes tasks more efficiently in both competitive and cooperative scenarios, at dynamic or constant scales.
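
The abstract notes that traditional algorithms struggle with scenarios of varying scales. One way to see why a sequence model helps is the toy sketch below: a recurrent encoder folds any number of 3D target positions into a fixed-size state, and a deterministic actor maps that state to a bounded 3D action. This is not the SMADDPG architecture from the paper; the class `SeqActor`, the hidden size, the random weights, and the sample target coordinates are all hypothetical, and there is no critic, decoder, or training loop.

```python
import math
import random

HIDDEN = 8  # hypothetical hidden-state size, chosen only for illustration


def make_matrix(rows, cols, rng):
    """Return a rows x cols matrix of small random weights (untrained)."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


class SeqActor:
    """Toy deterministic actor with a recurrent encoder (assumed design)."""

    def __init__(self, seed=0):
        rng = random.Random(seed)
        self.w_in = make_matrix(HIDDEN, 3, rng)       # (x, y, z) -> hidden
        self.w_h = make_matrix(HIDDEN, HIDDEN, rng)   # hidden -> hidden
        self.w_out = make_matrix(3, HIDDEN, rng)      # hidden -> action

    def encode(self, entities):
        """Fold a variable-length sequence of 3D points into one state."""
        h = [0.0] * HIDDEN
        for point in entities:
            from_input = matvec(self.w_in, point)
            from_state = matvec(self.w_h, h)
            h = [math.tanh(a + b) for a, b in zip(from_input, from_state)]
        return h

    def act(self, entities):
        """Map the encoded state to a 3D action bounded in [-1, 1]."""
        return [math.tanh(v) for v in matvec(self.w_out, self.encode(entities))]


actor = SeqActor(seed=0)
two_targets = [(1.0, 0.0, 2.0), (-3.0, 4.0, 1.0)]
five_targets = two_targets + [(0.5, 0.5, 0.5), (2.0, -1.0, 3.0), (0.0, 0.0, 1.0)]

# The same weights serve both scenario sizes; each call yields a 3D action.
action_small = actor.act(two_targets)
action_large = actor.act(five_targets)
print(len(action_small), len(action_large))
```

Because the encoder consumes its input one element at a time, the parameter count does not depend on how many targets are present, which is the property that lets a single policy be reused when the scenario scale changes.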

Publisher

MDPI AG

Subject

Fluid Flow and Transfer Processes, Computer Science Applications, Process Chemistry and Technology, General Engineering, Instrumentation, General Materials Science

Link

https://www.mdpi.com/2076-3417/12/23/12181/pdf

References: 50 articles.

1. Robust tracking control of a quadrotor UAV based on adaptive sliding mode controller;Huang;Complexity,2019

2. Task scheduling system for UAV operations in agricultural plant protection environment;Sun;J. Ambient Intell. Humaniz. Comput.,2020

3. Automatic safety routing inspection of the electric circuits based on UAV light detection and ranging;Meng;Destech Trans. Eng. Technol. Res.,2017

4. Scherer, J., and Rinner, B. (2016, January 21–25). Persistent multi-UAV surveillance with energy and communication constraints. Proceedings of the IEEE International Conference on Automation Science and Engineering, Fort Worth, TX, USA.

5. Chen, X.Y., Nan, Y., and Yang, Y. (2019). Multi-UAV reconnaissance task assignment for heterogeneous targets based on modified symbiotic organism search algorithm. Sensors, 19.

Cited by 2 articles.

1. Multi-UAV roundup strategy method based on deep reinforcement learning CEL-MADDPG algorithm;Expert Systems with Applications;2024-07

2. Cooperative task allocation for multi heterogeneous aerial vehicles using particle swarm optimization algorithm and entropy weight method;Applied Soft Computing;2023-11