Limiting Dynamics for Q-Learning with Memory One in Symmetric Two-Player, Two-Action Games

Published: 2022-11-08 Volume: 2022 Pages: 1-20
ISSN: 1099-0526
Container-title: Complexity
Language: en
Short-container-title: Complexity

Author:

Meylahn J. M.¹², Janssen L.³

Affiliation:

1. Department of Applied Mathematics, University of Twente, Enschede, Netherlands

2. Dutch Institute of Emergent Phenomena, University of Amsterdam, Amsterdam, Netherlands

3. Faculty of Science, University of Amsterdam, Amsterdam, Netherlands

Abstract

We develop a method based on computer algebra systems to represent the mutual pure strategy best-response dynamics of symmetric two-player, two-action repeated games played by players with a one-period memory. We apply this method to the iterated prisoner’s dilemma, stag hunt, and hawk-dove games and identify all possible equilibrium strategy pairs and the conditions for their existence. The only equilibrium strategy pair that is possible in all three games is the win-stay, lose-shift strategy. Lastly, we show that the mutual best-response dynamics are realized by a sample batch Q-learning algorithm in the infinite batch size limit.
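The idea of a mutual best response between memory-one pure strategies can be illustrated numerically. The sketch below is not the authors' computer-algebra method: it fixes concrete payoffs and a discount factor, which are assumptions made here for illustration, and checks only symmetric pairs (both players using the same strategy). A strategy is encoded as four actions indexed by the previous round's outcome, and a strategy counts as a best response if it is optimal in every one of the four states. The names `q_values`, `is_best_response`, `symmetric_equilibria`, and the payoff matrix `PD` are hypothetical.

```python
from itertools import product

C, D = 0, 1  # action labels: cooperate, defect


def q_values(opponent, payoff, delta, iters=2000):
    """Q[s][a] in the MDP induced by a fixed memory-one opponent.

    State s = 2 * own_last + other_last, seen from the learner's side.
    payoff[a][b] is the learner's payoff for playing a against b.
    """
    v = [0.0] * 4
    q = [[0.0, 0.0] for _ in range(4)]
    for _ in range(iters):  # plain value iteration
        for s in range(4):
            own, other = divmod(s, 2)
            # The opponent sees the same outcome with the roles swapped.
            b = opponent[2 * other + own]
            for a in (C, D):
                q[s][a] = payoff[a][b] + delta * v[2 * a + b]
        v = [max(row) for row in q]
    return q


def is_best_response(strategy, opponent, payoff, delta, tol=1e-9):
    """True if `strategy` is optimal against `opponent` in every state."""
    q = q_values(opponent, payoff, delta)
    return all(q[s][strategy[s]] >= max(q[s]) - tol for s in range(4))


def symmetric_equilibria(payoff, delta):
    """All memory-one pure strategies that are best responses to themselves."""
    return [p for p in product((C, D), repeat=4)
            if is_best_response(p, p, payoff, delta)]


# Assumed prisoner's dilemma payoffs: R=3, S=0, T=4, P=1; assumed delta=0.9.
PD = [[3.0, 0.0], [4.0, 1.0]]
equilibria = symmetric_equilibria(PD, 0.9)
print(equilibria)
```

With these assumed values, win-stay, lose-shift is the tuple `(0, 1, 1, 0)` (cooperate after CC or DD, defect otherwise) and always-defect is `(1, 1, 1, 1)`. Whether a given strategy appears in the output depends on the payoffs and the discount factor, which is the kind of existence condition the abstract refers to.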

Funder

University of Amsterdam

Publisher

Hindawi Limited

Subject

Multidisciplinary, General Computer Science

Link

http://downloads.hindawi.com/journals/complexity/2022/4830491.pdf
