Author:
You Yang, Li Liangwei, Guo Baisong, Wang Weiming, Lu Cewu
Abstract
Deep reinforcement learning (DRL) has gained considerable attention in recent years and has been proven capable of playing Atari games and Go at or above human level. However, those games assume a small, fixed number of actions and can be trained with a simple CNN. In this paper, we study Dou Di Zhu, a card game popular in Asia in which two adversarial groups of agents must consider numerous card combinations at each time step, leading to a huge action space. We propose a novel method to handle combinatorial actions, which we call combinatorial Q-learning (CQL). We employ a two-stage network to reduce the action space and leverage order-invariant max-pooling operations to extract relationships between primitive actions. Results show that our method prevails over baseline learning algorithms such as naive Q-learning and A3C. We develop an easy-to-use card game environment and train all agents adversarially from scratch, with only knowledge of the game rules, and verify that our agents are comparable to humans. Our code to reproduce all reported results is available at github.com/qq456cvb/doudizhu-C.
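The abstract mentions order-invariant max-pooling over primitive actions (individual cards) so that a card combination maps to the same representation regardless of listing order. The paper's own implementation is in the linked repository; the following is only a minimal NumPy sketch of the pooling idea, with hypothetical function names and embedding dimensions:

```python
import numpy as np

def pool_action_embedding(card_embeddings: np.ndarray) -> np.ndarray:
    """Aggregate the embeddings of the primitive actions (cards) in one
    combination with an element-wise max over the set dimension, so the
    result is invariant to the order in which the cards are listed.

    card_embeddings: array of shape (n_cards, embed_dim).
    Returns: array of shape (embed_dim,).
    """
    return card_embeddings.max(axis=0)

# Hypothetical 4-dimensional embeddings for a three-card combination.
rng = np.random.default_rng(0)
combo = rng.normal(size=(3, 4))

pooled = pool_action_embedding(combo)
# Pooling a permutation of the same cards yields an identical vector.
shuffled = pool_action_embedding(combo[[2, 0, 1]])
assert np.allclose(pooled, shuffled)
```

Because the max is taken element-wise across the set of cards, the same operation handles combinations of any size, which is what lets a single network score Dou Di Zhu's variable-length card moves.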
Publisher
Association for the Advancement of Artificial Intelligence (AAAI)
Cited by
6 articles.
1. Research on DouDiZhu Model Based on Deep Reinforcement Learning; 2023 7th Asian Conference on Artificial Intelligence Technology (ACAIT); 2023-11-10
2. DouRN: Improving DouZero by Residual Neural Networks; 2023 International Conference on Cyber-Enabled Distributed Computing and Knowledge Discovery (CyberC); 2023-11-02
3. DanZero: Mastering GuanDan Game with Reinforcement Learning; 2023 IEEE Conference on Games (CoG); 2023-08-21
4. JP-DouZero: an enhanced DouDiZhu AI based on reinforcement learning with peasant collaboration and intrinsic rewards; 2023 9th International Conference on Big Data Computing and Communications (BigCom); 2023-08-04
5. Deep Reinforcement Learning for Two-Player DouDizhu; 2022 Euro-Asia Conference on Frontiers of Computer Science and Information Technology (FCSIT); 2022-12