Abstract
This article proposes a novel Mahjong game model, LsAc*-MJ, designed to address three challenges: data scarcity, the difficulty of leveraging contextual information, and the heavy computational cost of self-play zero-shot learning. The model is evaluated experimentally on Japanese Mahjong. LsAc*-MJ employs long short-term memory (LSTM) neural networks, whose hidden nodes store and propagate contextual historical information, thereby improving decision accuracy. The paper also introduces an optimized Advantage Actor-Critic (A2C) algorithm that incorporates an experience replay mechanism to strengthen the model's decision-making and to mitigate the convergence difficulties caused by strongly correlated data. Furthermore, the paper presents a two-stage training approach for self-play deep reinforcement learning models guided by expert knowledge, which improves training efficiency. Extensive ablation experiments and performance comparisons show that, compared with other typical deep reinforcement learning models on the RLCard platform, LsAc*-MJ consumes less computation and time, trains more efficiently, and achieves a shorter average decision time, a higher win rate, and stronger decision-making ability.
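As a rough illustration of the experience replay idea mentioned in the abstract, the sketch below stores transitions in a fixed-size buffer and samples them uniformly, so that consecutive, strongly correlated steps are not fed to the learner together. It also computes the standard one-step advantage used by actor-critic methods. The names `ReplayBuffer` and `one_step_advantage`, the buffer capacity, and the discount factor are assumptions made for illustration; the abstract does not state how LsAc*-MJ implements its replay mechanism or handles off-policy samples.

```python
import random
from collections import deque


class ReplayBuffer:
    """Hypothetical fixed-size transition store (not the paper's implementation)."""

    def __init__(self, capacity, seed=None):
        # A deque with maxlen silently discards the oldest transition when full.
        self.buffer = deque(maxlen=capacity)
        self.rng = random.Random(seed)

    def push(self, state, action, reward, next_state, done):
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        # Uniform sampling without replacement breaks up temporal correlation.
        return self.rng.sample(list(self.buffer), batch_size)

    def __len__(self):
        return len(self.buffer)


def one_step_advantage(reward, value, next_value, done, gamma=0.99):
    """A(s, a) = r + gamma * V(s') - V(s), with V(s') taken as 0 at terminal states."""
    target = reward + (0.0 if done else gamma * next_value)
    return target - value
```

In a training loop, the critic would supply `value` and `next_value` for each sampled transition, and the resulting advantage would weight the actor's policy-gradient term.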
Funder
National Natural Science Foundation of China
National Social Science Fund of China