A Unified Analysis of Value-Function-Based Reinforcement-Learning Algorithms

Published: 1999-11-01 Issue: 8 Volume: 11 Pages: 2017-2060
ISSN: 0899-7667
Container title: Neural Computation
Language: en

Author:

Szepesvári Csaba¹, Littman Michael L.²

Affiliation:

1. Mindmaker, Ltd., Budapest 1121, Konkoly Thege M. U. 29-33, Hungary

2. Department of Computer Science, Duke University, Durham, NC 27708-0129, U.S.A.

Abstract

Reinforcement learning is the problem of generating optimal behavior in a sequential decision-making environment given the opportunity of interacting with it. Many algorithms for solving reinforcement-learning problems work by computing improved estimates of the optimal value function. We extend prior analyses of reinforcement-learning algorithms and present a powerful new theorem that can provide a unified analysis of such value-function-based reinforcement-learning algorithms. The usefulness of the theorem lies in how it allows the convergence of a complex asynchronous reinforcement-learning algorithm to be proved by verifying that a simpler synchronous algorithm converges. We illustrate the application of the theorem by analyzing the convergence of Q-learning, model-based reinforcement learning, Q-learning with multistate updates, Q-learning for Markov games, and risk-sensitive reinforcement learning.
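
The synchronous/asynchronous distinction in the abstract can be illustrated with a small sketch. This is not the paper's theorem or its proof technique; it only shows, on a hypothetical two-state deterministic MDP, that a synchronous sweep over all state-action pairs and an asynchronous one-pair-at-a-time Q-learning-style update approach the same fixed point. The MDP, the names `NEXT`, `REWARD`, `backup`, and the parameters `alpha`, `sweeps`, `updates`, and `seed` are all assumptions made for illustration. Pairs are sampled uniformly rather than along a trajectory, and transitions are deterministic, which keeps the example short.

```python
import random

# Hypothetical 2-state, 2-action deterministic MDP (illustrative, not from the paper).
GAMMA = 0.9
NEXT = {(0, 0): 0, (0, 1): 1, (1, 0): 0, (1, 1): 1}        # (state, action) -> next state
REWARD = {(0, 0): 0.0, (0, 1): 1.0, (1, 0): 2.0, (1, 1): 0.0}

def backup(q, s, a):
    """One-step Bellman optimality backup for the pair (s, a)."""
    s2 = NEXT[(s, a)]
    return REWARD[(s, a)] + GAMMA * max(q[(s2, 0)], q[(s2, 1)])

def synchronous_value_iteration(sweeps=500):
    """Update every (state, action) pair at once from the previous table."""
    q = {sa: 0.0 for sa in NEXT}
    for _ in range(sweeps):
        q = {(s, a): backup(q, s, a) for (s, a) in q}
    return q

def asynchronous_q_learning(updates=20000, alpha=0.5, seed=0):
    """Update one randomly chosen pair per step with a constant step size."""
    rng = random.Random(seed)
    q = {sa: 0.0 for sa in NEXT}
    pairs = sorted(q)
    for _ in range(updates):
        s, a = rng.choice(pairs)
        q[(s, a)] += alpha * (backup(q, s, a) - q[(s, a)])
    return q

q_sync = synchronous_value_iteration()
q_async = asynchronous_q_learning()
# Largest disagreement between the two tables over all (state, action) pairs.
max_gap = max(abs(q_sync[sa] - q_async[sa]) for sa in q_sync)
```

In this toy setting `max_gap` should be negligible. With stochastic transitions or rewards, the asynchronous update would generally need a decaying step size, and agreement would only hold in the limit.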

Publisher

MIT Press - Journals

Subject

Cognitive Neuroscience, Arts and Humanities (miscellaneous)

Link

https://www.mitpressjournals.org/doi/pdf/10.1162/089976699300016070

References: 12 articles.

1. Learning to act using real-time dynamic programming

2. Adaptive aggregation methods for infinite horizon dynamic programming

3. Embedding fields: A theory of learning with physiological implications

4. On the Convergence of Stochastic Iterative Dynamic Programming Algorithms

5. Reinforcement Learning: A Survey

Cited by 112 articles.

1. Reinforcement learning with predefined and inferred reward machines in stochastic games;Neurocomputing;2024-09

2. Dynamic Pricing for Vehicle Dispatching in Mobility-as-a-Service Market via Multi-Agent Deep Reinforcement Learning;IEEE Transactions on Vehicular Technology;2024-08

3. Transformer-Based Reinforcement Learning for Scalable Multi-UAV Area Coverage;IEEE Transactions on Intelligent Transportation Systems;2024-08

4. Markov game for CV joint adaptive routing in stochastic traffic networks: A scalable learning approach;Transportation Research Part B: Methodological;2024-06

5. Recovery from Adversarial Attacks in Cyber-physical Systems: Shallow, Deep, and Exploratory Works;ACM Computing Surveys;2024-04-26