Polygames: Improved zero learning

Published:2021-01-11 Issue:4 Volume:42 Page:244-256
ISSN:2468-2438
Container-title:ICGA Journal
Short-container-title:ICG

Author:

Cazenave Tristan¹,Chen Yen-Chi²,Chen Guan-Wei³,Chen Shi-Yu³,Chiu Xian-Dong³,Dehos Julien⁴,Elsa Maria³,Gong Qucheng⁵,Hu Hengyuan⁵,Khalidov Vasil⁵,Li Cheng-Ling³,Lin Hsin-I³,Lin Yu-Jin³,Martinet Xavier⁵,Mella Vegard⁵,Rapin Jeremy⁵,Roziere Baptiste⁵,Synnaeve Gabriel⁵,Teytaud Fabien⁴,Teytaud Olivier⁵,Ye Shi-Cheng³,Ye Yi-Jun³,Yen Shi-Jim³,Zagoruyko Sergey⁵

Affiliation:

1. LAMSADE, University Paris-Dauphine, PSL, France

2. National Taiwan Normal University, Taiwan

3. AILAB, Dong Hwa University, Taiwan

4. University Littoral Cote d’Opale, France

5. Facebook AI Research, France and United States

Abstract

Since DeepMind’s AlphaZero, zero learning has quickly become the state-of-the-art method for many board games. It can be improved by using a fully convolutional structure (no fully connected layer). Combining such an architecture with global pooling, we can create bots that are independent of the board size. Training can be made more robust by keeping track of the best checkpoints during training and by training against them. Using these features, we release Polygames, our framework for zero learning, with its library of games and its checkpoints. We won against strong humans at 19 × 19 Hex, including the human player with the best Elo rank on LittleGolem; incidentally, we also won against another zero implementation, which was weaker than humans, although a discussion on LittleGolem had claimed that Hex19 was intractable for zero learning. We also won at Havannah with size 8, beating the strongest player, namely Eobllor, with excellent opening moves. Finally, we won several first places at the TAAI 2019 competitions and had positive results against strong bots in various games.
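
The board-size independence mentioned in the abstract can be illustrated with a minimal sketch. This is not Polygames code: the function names (`conv3x3_same`, `global_average_pool`, `forward`), the single-filter heads, and the random weights are all assumptions made for illustration. The idea shown is that a convolution yields one output per cell whatever the board size, and global pooling reduces any feature map to a single scalar, so the same weights can be applied to boards of different sizes.

```python
import random


def conv3x3_same(board, kernel):
    """Apply one 3x3 filter with zero padding; output has the board's shape."""
    h, w = len(board), len(board[0])
    out = [[0.0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            acc = 0.0
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    rr, cc = r + dr, c + dc
                    if 0 <= rr < h and 0 <= cc < w:
                        acc += kernel[dr + 1][dc + 1] * board[rr][cc]
            out[r][c] = acc
    return out


def global_average_pool(feature_map):
    """Collapse a feature map of any size into one scalar."""
    cells = [v for row in feature_map for v in row]
    return sum(cells) / len(cells)


def forward(board, policy_kernel, value_kernel):
    """Hypothetical two-head network with no fully connected layer."""
    # Policy head: one logit per board cell, so its shape follows the board.
    policy_logits = conv3x3_same(board, policy_kernel)
    # Value head: convolution, ReLU, then global pooling down to a scalar.
    features = conv3x3_same(board, value_kernel)
    features = [[max(v, 0.0) for v in row] for row in features]
    value = global_average_pool(features)
    return policy_logits, value


# Illustrative random weights; a real system would learn them by self-play.
rng = random.Random(0)
policy_kernel = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(3)]
value_kernel = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(3)]

# The same weights are reused unchanged on three different board sizes.
for size in (5, 9, 19):
    board = [[rng.choice((-1, 0, 1)) for _ in range(size)] for _ in range(size)]
    logits, value = forward(board, policy_kernel, value_kernel)
    print(size, len(logits), len(logits[0]), type(value).__name__)
```

A fully connected layer would instead fix the number of inputs, and therefore the board size, at construction time, which is why the sketch avoids one.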

Publisher

IOS Press

Subject

Computer Graphics and Computer-Aided Design, Human-Computer Interaction, Computational Mechanics, Computer Science (miscellaneous)

Cited by 15 articles.

1. Ludii General Game System for Modeling, Analyzing, and Designing Board Games;Encyclopedia of Computer Graphics and Games;2024

2. Spatial state-action features for general games;Artificial Intelligence;2023-08

3. Analyses of Tabular AlphaZero on Strongly-Solved Stochastic Games;IEEE Access;2023

4. Ludii General Game System for Modeling, Analyzing, and Designing Board Games;Encyclopedia of Computer Graphics and Games;2023