Hybrid knowledge transfer for MARL based on action advising and experience sharing

Published:2024-05-07 Volume:18
ISSN:1662-5218
Container-title:Frontiers in Neurorobotics
Short-container-title:Front. Neurorobot.

Author:

Liu Feng,Li Dongqi,Gao Jian

Abstract

Multiagent reinforcement learning (MARL) has been widely adopted for its ability to solve multiagent decision-making problems. To further improve learning efficiency, knowledge transfer algorithms have been developed, among which experience-sharing-based and action-advising-based strategies are the two mainstream approaches. Although both strategies have many successful applications, neither is flawless. The long-established action-advising-based methods (KT-AA, short for knowledge transfer based on action advising) have unsatisfactory data efficiency and scalability. The more recently proposed experience-sharing-based methods (KT-ES) partially overcome these shortcomings, but they cannot correct specific bad decisions in the later learning stage. To combine the strengths of KT-AA and KT-ES, this study proposes KT-Hybrid, a hybrid knowledge transfer approach. In the early learning phase, KT-ES methods are employed, with the expectation that their better data efficiency will bring the policy to a basic level as quickly as possible. In the later phase, KT-AA methods are used to correct specific errors made by this basic policy and further improve performance. Simulations demonstrate that the proposed KT-Hybrid outperforms well-received action-advising- and experience-sharing-based methods.
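The two-phase idea in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract gives no algorithmic details, so the `Agent` class, the fixed `switch_step` threshold, the peer-buffer copy used for experience sharing, and the advice budget are all hypothetical choices made only to show how a switch from experience sharing to action advising might be wired together.

```python
from collections import deque


class Agent:
    """Toy tabular agent; every name here is hypothetical."""

    def __init__(self, n_actions):
        self.n_actions = n_actions
        self.q = {}                        # state -> list of action values
        self.buffer = deque(maxlen=1000)   # replay buffer of transitions

    def q_values(self, state):
        return self.q.setdefault(state, [0.0] * self.n_actions)

    def greedy_action(self, state):
        values = self.q_values(state)
        return max(range(self.n_actions), key=lambda a: values[a])


def transfer_mode(step, switch_step):
    """Assumed schedule: experience sharing early, action advising later."""
    return "experience_sharing" if step < switch_step else "action_advising"


def share_experience(sender, agents, transition):
    """KT-ES-style step (assumed form): copy one transition to peers' buffers."""
    for peer in agents:
        if peer is not sender:
            peer.buffer.append(transition)


def advised_action(learner, advisor, state, budget):
    """KT-AA-style step (assumed form): the advisor's greedy action replaces
    the learner's when they disagree and advice budget remains.
    Returns (action, remaining_budget)."""
    own = learner.greedy_action(state)
    if budget <= 0:
        return own, budget
    suggestion = advisor.greedy_action(state)
    if suggestion == own:
        return own, budget
    return suggestion, budget - 1
```

In a training loop, `transfer_mode(step, switch_step)` would decide at each step whether to call `share_experience` or `advised_action`. A fixed step threshold is the simplest possible switching rule; the paper may well use a different criterion.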

Publisher

Frontiers Media SA
