Solving large-scale multi-agent tasks via transfer learning with dynamic state representation

Published: 2023-03-01 Issue: 2 Volume: 20 Page: 172988062311624
ISSN:1729-8806
Container-title:International Journal of Advanced Robotic Systems
language:en
Short-container-title:International Journal of Advanced Robotic Systems

Author:

Dou Lintao¹, Jia Zhen², Huang Jian¹

Affiliation:

1. Academy of Intelligent Sciences, National University of Defense Technology, Changsha, China

2. Jiangsu Automation Research Institute, Lianyungang, China

Abstract

Multi-agent reinforcement learning has produced many research results over the past decade. These include the successful application of asynchronous advantage actor-critic, double deep Q-network, and other algorithms in multi-agent environments, as well as QMIX, a representative multi-agent training method based on the classical centralized training, distributed execution approach. However, in a large-scale multi-agent environment, training becomes a major challenge due to the exponential growth of the state-action space. In this article, we design a training scheme that progresses from small-scale to large-scale multi-agent training. We use transfer learning so that training with a large number of agents can reuse the knowledge accumulated while training with a small number of agents. We achieve policy transfer between tasks with different numbers of agents by designing a new dynamic state representation network, which uses a self-attention mechanism to capture and represent the local observations of agents. The dynamic state representation network makes it possible to expand the policy model from tasks with few agents (4 or 10 agents) to large-scale tasks (16 or 50 agents). Furthermore, we conducted experiments in the real-time strategy game StarCraft II and on the multi-agent research platform MAgent, and also set up unmanned aerial vehicle trajectory planning simulations. Experimental results show that our approach not only reduces the training time of tasks with a large number of agents but also improves the final training performance.
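
As a rough illustration of why a self-attention encoder allows a policy to move between tasks with different agent counts, the sketch below applies generic scaled dot-product self-attention to a variable number of observed entities and mean-pools the result into a fixed-length vector. It is not the dynamic state representation network from the article: the function names, the feature sizes (`D_IN`, `D_HID`), the random weights, and the mean-pooling step are all assumptions chosen for the example.

```python
import math
import random


def matmul(a, b):
    """Multiply matrix a (n x m) by matrix b (m x p), both lists of lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax(row):
    """Numerically stable softmax over one list of scores."""
    m = max(row)
    exps = [math.exp(s - m) for s in row]
    total = sum(exps)
    return [e / total for e in exps]


def make_weights(d_in, d_out, rng):
    """Hypothetical random projection matrix of shape d_in x d_out."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(d_out)] for _ in range(d_in)]


def attention_pool(entities, w_q, w_k, w_v):
    """Self-attention over n entity vectors, then mean-pool to one vector.

    The output length equals the column count of w_v, whatever n is.
    """
    q, k, v = matmul(entities, w_q), matmul(entities, w_k), matmul(entities, w_v)
    d_k = len(k[0])
    k_t = [list(col) for col in zip(*k)]
    scores = [[s / math.sqrt(d_k) for s in row] for row in matmul(q, k_t)]
    weights = [softmax(row) for row in scores]  # each row sums to 1
    attended = matmul(weights, v)  # n x d_v
    n = len(attended)
    return [sum(col) / n for col in zip(*attended)]


rng = random.Random(0)
D_IN, D_HID = 6, 8  # assumed feature and embedding sizes
w_q, w_k, w_v = (make_weights(D_IN, D_HID, rng) for _ in range(3))

small = [[rng.random() for _ in range(D_IN)] for _ in range(4)]   # 4-entity observation
large = [[rng.random() for _ in range(D_IN)] for _ in range(16)]  # 16-entity observation

# The same weights encode both observations into vectors of equal length.
state_small = attention_pool(small, w_q, w_k, w_v)
state_large = attention_pool(large, w_q, w_k, w_v)
print(len(state_small), len(state_large))
```

Because the projection weights act on each entity separately and the pooled output has a fixed length, the layers that consume this vector never see the agent count, which is the property that makes reusing small-scale weights in a larger task plausible in a design of this kind.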

Publisher

SAGE Publications

Subject

Artificial Intelligence, Computer Science Applications, Software

Link

http://journals.sagepub.com/doi/pdf/10.1177/17298806231162440

Reference: 44 articles.

1. Human-level control through deep reinforcement learning

2. Multi-agent Reinforcement Learning: An Overview

3. Peng P, Yuan Q, Wen Y, et al. Multiagent bidirectionally-coordinated nets for learning to play StarCraft combat games. 2017. CoRR, abs/1703.10069.