Abstract
This paper presents a new method, Minimax Tree Optimization (MMTO), to learn a heuristic evaluation function of a practical alpha-beta search program. The evaluation function may be a linear or non-linear combination of weighted features, and the weights are the parameters to be optimized. To control the search results so that the move decisions agree with the game records of human experts, a well-modeled objective function to be minimized is designed. Moreover, a numerical iterative method is used to find local minima of the objective function, and more than forty million parameters are adjusted by using a small number of hyperparameters. This method was applied to shogi, a major variant of chess in which the evaluation function must handle a larger state space than in chess. Experimental results show that the large-scale optimization of the evaluation function improves the playing strength of shogi programs, and the new method performs significantly better than other methods. Implementation of the new method in our shogi program Bonanza made substantial contributions to the program's first-place finish in the 2013 World Computer Shogi Championship. Additionally, we present preliminary evidence of broader applicability of our method to other two-player games such as chess.
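The idea of tuning weights so that move decisions agree with expert records can be illustrated with a small sketch. It is not the MMTO implementation described in the paper: it assumes a linear evaluation, scores each move by evaluating the resulting position directly instead of running an alpha-beta search, omits any constraint or regularization terms, and uses plain gradient descent as the iterative method. All names (`evaluate`, `objective`, `optimize`, the toy `positions` data) are hypothetical.

```python
import math

def evaluate(features, weights):
    """Linear evaluation: weighted sum of feature values (assumed form)."""
    return sum(w * f for w, f in zip(weights, features))

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def objective(positions, weights):
    """Penalize every non-expert move that scores close to or above the expert move.

    Each position is (expert_features, [other_features, ...]), where the
    features describe the position reached after playing that move.
    """
    total = 0.0
    for expert, others in positions:
        s_expert = evaluate(expert, weights)
        for other in others:
            total += sigmoid(evaluate(other, weights) - s_expert)
    return total

def gradient(positions, weights):
    """Analytic gradient of the objective with respect to the weights."""
    grad = [0.0] * len(weights)
    for expert, others in positions:
        s_expert = evaluate(expert, weights)
        for other in others:
            t = sigmoid(evaluate(other, weights) - s_expert)
            dt = t * (1.0 - t)  # derivative of the sigmoid
            for i in range(len(weights)):
                grad[i] += dt * (other[i] - expert[i])
    return grad

def optimize(positions, weights, steps=200, lr=0.5):
    """Plain gradient descent toward a local minimum (illustrative only)."""
    w = list(weights)
    for _ in range(steps):
        g = gradient(positions, w)
        w = [wi - lr * gi for wi, gi in zip(w, g)]
    return w

# Toy data: two positions, each with one expert move and some alternatives.
positions = [
    ([1.0, 0.0], [[0.0, 1.0]]),
    ([2.0, 1.0], [[1.0, 2.0], [0.0, 0.0]]),
]
initial = [0.0, 0.0]
trained = optimize(positions, initial)
```

With the toy data, the objective drops after training and the expert move ends up with the highest score in each position. In a real program the scores would come from search results rather than from a direct evaluation of the features.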
Cited by
27 articles.
1. Contemporary Computer Shogi;Encyclopedia of Computer Graphics and Games;2024
2. Contemporary Computer Shogi;Encyclopedia of Computer Graphics and Games;2023
3. Improving Mini-Shogi Engine Using Self-Play and Possibility of White's Advantage;J INF SCI ENG;2022
4. Mimicking the Human Approach in the Game of Hive;2021 IEEE Symposium Series on Computational Intelligence (SSCI);2021-12-05
5. Evaluation of Loss Function for Stable Policy Learning in Dobutsu Shogi;2020 International Conference on Technologies and Applications of Artificial Intelligence (TAAI);2020-12