Affiliation
1. School of Automation Science and Engineering, South China University of Technology, Guangzhou 510641, China
Abstract
In this paper, we propose a multiphase semistatic training method for swarm confrontation using multi-agent deep reinforcement learning (MDRL). In particular, we build a swarm confrontation game, the 3V3 tank fight, on the Unity platform and train the agents with an MDRL algorithm called MA-POCA, which ships with the ML-Agents toolkit. With multiphase learning, we split the traditional single training phase into multiple consecutive training phases, in which the performance level of the strong team increases incrementally from one phase to the next. With semistatic learning, the strong team stops learning in every phase while fighting the weak team, which reduces the chance that the weak team is defeated repeatedly and learns nothing. Comprehensive experiments show that, compared with the traditional single-phase training method, the proposed multiphase semistatic training method significantly increases training efficiency, shedding light on how the weak can learn from the strong with less time and computational cost.
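The training schedule described in the abstract can be pictured roughly as follows. This is a minimal sketch, not the authors' implementation: the names `Phase`, `build_schedule`, `run_schedule`, and the `train_phase` callback are hypothetical, the integer "level" is only a stand-in for the strong team's performance level, and carrying the weak team's policy from one phase to the next is an assumption. No Unity or ML-Agents API is used here.

```python
from dataclasses import dataclass
from typing import Any, Callable, List, Optional


@dataclass
class Phase:
    index: int
    opponent_level: int    # hypothetical label for the strong team's performance level
    opponent_learns: bool  # False means the strong team is frozen (semistatic)


def build_schedule(opponent_levels: List[int]) -> List[Phase]:
    """Create one phase per strong-team level, ordered from weakest to strongest."""
    levels = sorted(opponent_levels)
    return [Phase(i, lvl, opponent_learns=False) for i, lvl in enumerate(levels)]


def run_schedule(
    schedule: List[Phase],
    train_phase: Callable[[Optional[Any], Phase], Any],
) -> Optional[Any]:
    """Run the phases consecutively.

    Assumption: the weak team's policy returned by one phase is the starting
    point of the next. `train_phase` is a placeholder for the actual trainer.
    """
    policy = None
    for phase in schedule:
        policy = train_phase(policy, phase)
    return policy
```

In this sketch, `train_phase` would wrap whatever trainer is actually used; passing a stub that only records the phases it receives is enough to check the ordering and the frozen-opponent flag.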
Funder
National Natural Science Foundation of China
Subject
General Mathematics, General Medicine, General Neuroscience, General Computer Science