Two-phase selective decentralization to improve reinforcement learning systems with MDP

Published: 2018-06-11 Issue: 4 Volume: 31 Page: 319-337
ISSN: 1875-8452
Container-title: AI Communications
Short-container-title: AIC

Author:

Nguyen Thanh¹, Mukhopadhyay Snehasis¹

Affiliation:

1. Department of Computer and Information Science, Indiana University Purdue University Indianapolis, 723 W Michigan St SL 280 Indianapolis, Indiana 46202, United States. E-mails: thamnguy@iupui.edu, smukhopa@iupui.edu

Publisher

IOS Press

Subject

Artificial Intelligence

References: 69 articles.

1. Nearly optimal control laws for nonlinear systems with saturating actuators using a neural network HJB approach; Abu-Khalaf; Automatica, 2005

2. W.F. Arnold III and A.J. Laub, Generalized eigenproblem algorithms and software for algebraic Riccati equations, in: Proceedings of the IEEE, Vol. 72, 1984, pp. 1746–1754.

3. G. Arslan and S. Yüksel, Decentralized Q-learning for weakly acyclic stochastic dynamic games, in: IEEE Conference on Decision and Control, 2015, pp. 6743–6748.

4. Decentralized Q-learning for stochastic teams and games; Arslan; IEEE Transactions on Automatic Control, 2017

5. Temporal difference learning; Barto; Scholarpedia, 2007

Cited by 2 articles.

1. Influence of Psychological Factors on Evacuation in Fire Scene Based on SPSS Data Analysis; Cyber Security Intelligence and Analytics; 2022

2. Why the ‘selfish’ optimizing agents could solve the decentralized reinforcement learning problems; AI Communications; 2019-05-16