Affiliation:
1. Universiteit Leiden, Leiden Institute of Advanced Computer Science, Leiden, Netherlands
Abstract
The landmark achievements of AlphaGo Zero have sparked great research interest in self-play for reinforcement learning. In self-play, Monte Carlo Tree Search (MCTS) is used to generate training data for a deep neural network, which in turn guides subsequent tree searches. The training is governed by many hyper-parameters. There has been surprisingly little research on design choices for hyper-parameter values and loss functions, presumably because of the prohibitive computational cost of exploring the parameter space. In this paper, we investigate 12 hyper-parameters in an AlphaZero-like self-play algorithm and evaluate how these parameters contribute to training. Through multi-objective analysis, we identify four important hyper-parameters for further assessment. As a first, surprising result, we find that too much training can sometimes lower performance. Our main result is that the number of self-play iterations subsumes MCTS search simulations, game episodes, and training epochs. Based on our experiments, we provide recommendations for setting hyper-parameter values in self-play: the outer loop of self-play iterations should be emphasized over the inner loop, meaning that hyper-parameters for the inner loop should be set to lower values. A secondary result concerns the choice of optimization goals, for which we also provide recommendations.
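To make the loop structure concrete, below is a minimal Python sketch of an AlphaZero-like self-play loop. All names, the toy action space, and the stub functions are hypothetical illustrations; the paper's actual algorithm, network, and hyper-parameter values may differ.

    import random

    # Inner-loop hyper-parameters (the paper recommends keeping these low
    # in favor of more outer-loop iterations).
    MCTS_SIMULATIONS = 25   # tree-search simulations per move
    GAME_EPISODES = 10      # self-play games generated per iteration
    TRAINING_EPOCHS = 5     # training passes over new examples per iteration

    # Outer-loop hyper-parameter (the one the paper recommends emphasizing).
    SELF_PLAY_ITERATIONS = 100

    def mcts_move(net, state, simulations):
        """Stub: choose a move guided by `simulations` MCTS rollouts and `net`."""
        return random.choice(range(9))  # placeholder action space

    def play_episode(net):
        """Stub: play one self-play game, returning (state, target) examples."""
        return [(state, mcts_move(net, state, MCTS_SIMULATIONS))
                for state in range(5)]  # placeholder game of 5 moves

    def train(net, examples, epochs):
        """Stub: fit the network to the self-play examples for `epochs` passes."""
        for _ in range(epochs):
            pass  # gradient updates would go here
        return net

    net = object()  # placeholder for the deep neural network
    for iteration in range(SELF_PLAY_ITERATIONS):   # outer loop
        examples = []
        for _ in range(GAME_EPISODES):              # inner loop: data generation
            examples.extend(play_episode(net))
        net = train(net, examples, TRAINING_EPOCHS) # inner loop: training

The paper's recommendation, in these terms, is that raising SELF_PLAY_ITERATIONS tends to subsume the benefit of raising MCTS_SIMULATIONS, GAME_EPISODES, or TRAINING_EPOCHS.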
Funder
China Scholarship Council
Publisher
World Scientific Pub Co Pte Ltd
Subject
General Medicine, Computer Science (miscellaneous)
Cited by
2 articles.