A Novel Heterogeneous Swarm Reinforcement Learning Method for Sequential Decision Making Problems

Published:2019-04-16 Issue:2 Volume:1 Page:590-610
ISSN:2504-4990
Container-title:Machine Learning and Knowledge Extraction
language:en
Short-container-title:MAKE

Author:

Akbari Zohreh, Unland Rainer

Abstract

Sequential Decision Making Problems (SDMPs) that can be modeled as Markov Decision Processes can be solved using methods that combine Dynamic Programming (DP) and Reinforcement Learning (RL). Depending on the problem scenarios and the available Decision Makers (DMs), such RL algorithms may be designed for single-agent systems or multi-agent systems that either consist of agents with individual goals and decision making capabilities, which are influenced by other agents' decisions, or behave as a swarm of agents that collaboratively learn a single objective. Many studies have been conducted in this area; however, when concentrating on available swarm RL algorithms, one obtains a clear view of the areas that still require attention. Most of the studies in this area focus on homogeneous swarms, and so far, systems introduced as Heterogeneous Swarms (HetSs) include only very few, i.e., two or three, sub-swarms of homogeneous agents, which either, according to their capabilities, deal with a specific sub-problem of the general problem or exhibit different behaviors in order to reduce the risk of bias. This study introduces a novel approach that allows agents, which are originally designed to solve different problems and hence have higher degrees of heterogeneity, to behave as a swarm when addressing identical sub-problems. In fact, the affinity between two agents, which measures the compatibility of agents to work together towards solving a specific sub-problem, is used in designing a Heterogeneous Swarm RL (HetSRL) algorithm that allows HetSs to solve the intended SDMPs.

Publisher

MDPI AG

Subject

General Economics, Econometrics and Finance

Link

https://www.mdpi.com/2504-4990/1/2/35/pdf

References: 63 articles.

1. Markov Decision Processes: Discrete Stochastic Dynamic Programming;Puterman,1994

2. Swarm Intelligence in Cellular Robotic Systems

3. Swarmanoid: A Novel Concept for the Study of Heterogeneous Robotic Swarms

Cited by 4 articles.

1. Continuous Control With Swarm Intelligence Based Value Function Approximation;IEEE Transactions on Automation Science and Engineering;2024-01

2. Character-Based Value Factorization For MADRL;The Computer Journal;2022-09-19

3. A Powerful Holonic and Multi-Agent-Based Front-End for Medical Diagnostics Systems;Handbook of Artificial Intelligence in Healthcare;2021-09-18

4. AI Meets CRNs: A Prospective Review on the Application of Deep Architectures in Spectrum Management;IEEE Access;2021