A distributed adaptive policy gradient method based on momentum for multi-agent reinforcement learning

Published:2024-07-12 Issue:5 Volume:10 Page:7297-7310
ISSN:2199-4536
Container-title:Complex & Intelligent Systems
language:en
Short-container-title:Complex Intell. Syst.

Author:

Shi Junru, Wang Xin, Zhang Mingchuan, Liu Muhua, Zhu Junlong, Wu Qingtao

Abstract

Policy Gradient (PG) methods are among the most popular algorithms in Reinforcement Learning (RL). However, distributed adaptive variants of PG have rarely been studied in multi-agent settings. For this reason, this paper proposes a distributed adaptive policy gradient algorithm (IS-DAPGM) that incorporates Adam-type updates and an importance sampling technique. Furthermore, we establish a theoretical convergence rate of $\mathcal{O}(1/\sqrt{T})$, where $T$ is the number of iterations, which matches the convergence rate of state-of-the-art centralized policy gradient methods. In addition, many experiments are conducted in a multi-agent environment that is a modification of the Particle world environment. By comparing with other distributed PG methods and varying the number of agents, we verify that IS-DAPGM is more efficient than the existing methods.

Funder

National Natural Science Foundation of China

Publisher

Springer Science and Business Media LLC

Link

https://link.springer.com/content/pdf/10.1007/s40747-024-01529-6.pdf
