Affiliation:
1. Department of Computer Science and Technology, Tongji University, Shanghai 201804, China
2. Department of Computer Science and Software Engineering, Monmouth University, West Long Branch, NJ 07764, USA
Abstract
Multiple unmanned aerial vehicle (multi-UAV) systems have recently demonstrated significant advantages in real-world scenarios, but the limited communication range of UAVs poses great challenges to multi-UAV collaborative decision-making. By modeling the multi-UAV cooperation problem as a multi-agent system (MAS), cooperative decision-making among UAVs can be realized using multi-agent reinforcement learning (MARL). Following this paradigm, this work focuses on developing partially observable MARL models that capture important information from local observations in order to select effective actions. Previous related studies employ either probability distributions or a weighted mean field to update the average actions of neighborhood agents. However, they do not fully exploit the feature information of surrounding neighbors and therefore often converge to local optima. In this paper, we propose a novel partially observable multi-agent reinforcement learning algorithm to remedy this flaw, based on a graph attention network and a partially observable mean field, named GPMF for short. GPMF uses a graph attention module and a mean field module to describe how an agent is influenced by the actions of other agents at each time step. The graph attention module consists of a graph attention encoder and a differentiable attention mechanism, and outputs a dynamic graph representing the influence of neighborhood agents on the central agent. The mean field module approximates the effect of the neighborhood agents on a central agent as the average effect of the effective neighborhood agents. Aiming at the typical task scenario of large-scale multi-UAV cooperative roundup, the proposed algorithm is evaluated within the MAgent framework.
Experimental results show that GPMF outperforms baselines including state-of-the-art partially observable mean field reinforcement learning algorithms, providing technical support for large-scale multi-UAV coordination and confrontation tasks in communication-constrained environments.
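The abstract's combination of an attention module and a mean field module can be illustrated with a minimal sketch: attention weights over observable neighbors are used to form a weighted average of neighbor actions, which stands in for the plain mean-field average. This is an illustrative reading of the described mechanism, not the authors' implementation; the projection matrices `W_q` and `W_k` and the function names are hypothetical.

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax over the last axis."""
    z = x - x.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def attention_weighted_mean_action(center_feat, neighbor_feats,
                                   neighbor_actions, W_q, W_k):
    """
    Illustrative attention-weighted mean-field action (hypothetical sketch).

    center_feat:      (d,)   local feature of the central agent
    neighbor_feats:   (n, d) features of observable neighbors
    neighbor_actions: (n, a) one-hot (or soft) actions of those neighbors
    W_q, W_k:         (d, h) hypothetical learned projection matrices
    """
    q = center_feat @ W_q                  # query from the central agent
    k = neighbor_feats @ W_k               # keys from the neighbors
    scores = k @ q / np.sqrt(k.shape[-1])  # scaled dot-product scores
    alpha = softmax(scores)                # attention weights over neighbors
    # Mean-field action: attention-weighted average of neighbor actions,
    # emphasizing the "effective" neighbors instead of a uniform average.
    return alpha @ neighbor_actions
```

With one-hot neighbor actions, the result is a probability-like vector over the action space (non-negative, summing to 1), which a Q-network could then consume alongside the agent's own observation.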
Funder
Shanghai Science and Technology Committee
National Natural Science Foundation of China
Fundamental Research Funds for the Central Universities
Subject
Artificial Intelligence, Computer Science Applications, Aerospace Engineering, Information Systems, Control and Systems Engineering
Cited by: 5 articles.