Affiliation:
1. Suzhou Joint Graduate School, Southeast University, Suzhou, Jiangsu Province 215123, P. R. China
2. School of Mathematics, Southeast University, Nanjing, Jiangsu Province 210096, P. R. China
Abstract
Reinforcement learning has proven to be an effective approach for solving multi-agent coordination problems in dynamic, open environments. For multi-agent cooperation, the mean field multi-agent reinforcement learning method mitigates slow learning, unstable convergence, and poor learning performance; however, the original mean field algorithm extracts features poorly when agents cooperate. To address the large-scale multi-agent coordination problem, this paper improves the mean field multi-agent reinforcement learning algorithm by incorporating a multi-head attention mechanism and designs the attention-based mean field (MFA) structure. The multi-head attention mechanism refines the interactions among agents, extracts more informative cluster features, and enables agents to learn more efficient strategies. This paper first introduces the framework of MFA, then develops the relevant theoretical basis on top of the Q-learning and Actor-Critic algorithms, and finally conducts large-scale multi-agent cooperative experiments on the MAgent platform. The experimental results show that, compared with the baseline algorithms, the attention-based mean field Q-learning (MFQA) and attention-based mean field Actor-Critic (MFACA) algorithms allow large-scale multi-agent clusters to converge to higher rewards and outperform the original mean field multi-agent algorithms.
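The core idea summarized above, replacing the uniform mean of neighbor actions in mean field Q-learning with a multi-head attention-weighted mean, can be illustrated with a minimal sketch. This is a hypothetical NumPy illustration, not the paper's exact network: the projection matrices `Wq`, `Wk`, the dimensions, and the function name are assumptions introduced for clarity.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax for the attention weights.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention_mean_action(center_obs, neighbor_actions, Wq, Wk, num_heads=2):
    """Attention-weighted mean of neighbor actions (hypothetical sketch).

    center_obs:       (obs_dim,)  observation of the central agent (query source)
    neighbor_actions: (n, act_dim) one-hot or continuous actions of n neighbors
    Wq:               (obs_dim, d) query projection (assumed parameter)
    Wk:               (act_dim, d) key projection (assumed parameter)
    Returns an (act_dim,) vector replacing the plain mean action.
    """
    d = Wq.shape[1]
    dh = d // num_heads                      # per-head key/query dimension
    q = center_obs @ Wq                      # (d,)
    k = neighbor_actions @ Wk                # (n, d)
    head_outputs = []
    for h in range(num_heads):
        qh = q[h * dh:(h + 1) * dh]          # query slice for this head
        kh = k[:, h * dh:(h + 1) * dh]       # key slices for this head
        scores = kh @ qh / np.sqrt(dh)       # scaled dot-product scores, (n,)
        w = softmax(scores)                  # attention weights over neighbors
        head_outputs.append(w @ neighbor_actions)  # weighted mean action
    return np.mean(head_outputs, axis=0)     # average the heads
```

Because each head forms a convex combination of the neighbor actions, the output reduces to the ordinary mean field action when all attention weights are uniform, so this sketch is a strict generalization of the plain mean used by the original algorithm.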
Funder
National Key R&D Program of China
National Natural Science Foundation of China
Jiangsu Provincial Key Laboratory of Networked Collective Intelligence
Publisher
World Scientific Pub Co Pte Ltd