Partially Observable Mean Field Multi-Agent Reinforcement Learning Based on Graph Attention Network for UAV Swarms

Published:2023-07-20 Issue:7 Volume:7 Page:476
ISSN:2504-446X
Container-title:Drones
language:en
Short-container-title:Drones

Author:

Yang Min¹, Liu Guanjun¹, Zhou Ziyuan¹, Wang Jiacun²

Affiliation:

1. Department of Computer Science and Technology, Tongji University, Shanghai 201804, China

2. Department of Computer Science and Software Engineering, Monmouth University, West Long Branch, NJ 07764, USA

Abstract

Multiple unmanned aerial vehicle (multi-UAV) systems have recently demonstrated significant advantages in some real-world scenarios, but the limited communication range of UAVs poses great challenges to multi-UAV collaborative decision-making. By modeling the multi-UAV cooperation problem as a multi-agent system (MAS), cooperative decision-making among UAVs can be realized using multi-agent reinforcement learning (MARL). Following this paradigm, this work focuses on developing partially observable MARL models that capture important information from local observations in order to select effective actions. Previous related studies employ either probability distributions or a weighted mean field to update the average actions of neighborhood agents. However, they do not fully consider the feature information of surrounding neighbors, which often results in a local optimum. In this paper, we propose a novel partially observable multi-agent reinforcement learning algorithm to remedy this flaw. It is based on a graph attention network and a partially observable mean field, and is named GPMF for short. GPMF uses a graph attention module and a mean field module to describe how an agent is influenced by the actions of other agents at each time step. The graph attention module consists of a graph attention encoder and a differentiable attention mechanism, and outputs a dynamic graph that represents the effectiveness of neighborhood agents with respect to central agents. The mean field module approximates the effect of neighborhood agents on a central agent as the average effect of the effective neighborhood agents. The proposed algorithm is evaluated on a typical large-scale multi-UAV cooperative roundup task using the MAgent framework. Experimental results show that GPMF outperforms baselines including state-of-the-art partially observable mean field reinforcement learning algorithms, providing technical support for large-scale multi-UAV coordination and confrontation tasks in communication-constrained environments.
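
The idea of averaging only over "effective" neighbors can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `effective_mean_action`, the scaled dot-product scoring, and the rule of keeping neighbors whose attention weight is at least the uniform weight 1/N are all assumptions made for illustration, and the paper's learned graph attention encoder and differentiable selection are replaced here by a fixed, non-learned score.

```python
import numpy as np

def effective_mean_action(center_feat, neigh_feats, neigh_actions, n_actions):
    """Illustrative only: pick 'effective' neighbors by attention, then average their actions.

    center_feat:   (d,) feature vector of the central agent
    neigh_feats:   (N, d) feature vectors of the N observed neighbors
    neigh_actions: (N,) integer action indices of the neighbors
    """
    center_feat = np.asarray(center_feat, dtype=float)
    neigh_feats = np.asarray(neigh_feats, dtype=float)
    neigh_actions = np.asarray(neigh_actions, dtype=int)

    # Scaled dot-product scores (assumption: stands in for a learned attention encoder).
    scores = neigh_feats @ center_feat / np.sqrt(center_feat.shape[0])
    weights = np.exp(scores - scores.max())  # numerically stable softmax
    weights /= weights.sum()

    # Hypothetical selection rule: keep neighbors at or above the uniform weight 1/N.
    mask = weights >= 1.0 / len(weights)
    if not mask.any():
        # Guard against floating-point edge cases: always keep the top neighbor.
        mask[np.argmax(weights)] = True

    # Mean field step: average the one-hot actions of the selected neighbors only.
    one_hot = np.eye(n_actions)[neigh_actions]
    mean_action = one_hot[mask].mean(axis=0)
    return mask, mean_action

mask, mean_action = effective_mean_action(
    center_feat=[1.0, 0.0],
    neigh_feats=[[2.0, 0.0], [0.0, 1.0], [-2.0, 0.0]],
    neigh_actions=[0, 1, 2],
    n_actions=3,
)
```

In this toy call only the first neighbor is aligned with the central agent's features, so the mean action is computed from that neighbor alone rather than from all three. In a learned setting the mask would come from a trainable, differentiable attention module instead of a fixed threshold.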

Funder

Shanghai Science and Technology Committee

National Natural Science Foundation of China

Fundamental Research Funds for the Central Universities

Publisher

MDPI AG

Subject

Artificial Intelligence, Computer Science Applications, Aerospace Engineering, Information Systems, Control and Systems Engineering

Link

https://www.mdpi.com/2504-446X/7/7/476/pdf

References: 48 articles.

1. Frattolillo, F., Brunori, D., and Iocchi, L. (2023). Scalable and Cooperative Deep Reinforcement Learning Approaches for Multi-UAV Systems: A Systematic Review. Drones, 7.

2. Distributed sliding mode control for time-varying formation tracking of multi-UAV system with a dynamic leader;Wang;Aerosp. Sci. Technol.,2021

3. MARL Sim2real Transfer: Merging Physical Reality With Digital Virtuality in Metaverse;Shi;IEEE Trans. Syst. Man Cybern. Syst.,2022

4. Co-TS: Design and Implementation of a 2-UAV Cooperative Transportation System;Weng;Int. J. Micro Air Veh.,2023

5. Lightweight unmanned aerial vehicle video object detection based on spatial-temporal correlation;Zhou;Int. J. Commun. Syst.,2022

Cited by 5 articles.

1. Revolutionizing RPAS logistics and reducing CO2 emissions with advanced RPAS technology for delivery systems;Cleaner Logistics and Supply Chain;2024-09

2. Comprehensive Review of Drones Collision Avoidance Schemes: Challenges and Open Issues;IEEE Transactions on Intelligent Transportation Systems;2024-07

3. Truck-Drone Delivery Optimization Based on Multi-Agent Reinforcement Learning;Drones;2024-01-20

4. Learning to Detect Critical Nodes in Sparse Graphs via Feature Importance Awareness;IEEE Transactions on Automation Science and Engineering;2024

5. Robust Multi-Agent Reinforcement Learning Method Based on Adversarial Domain Randomization for Real-World Dual-UAV Cooperation;IEEE Transactions on Intelligent Vehicles;2023