Affiliation:
1. School of Information Engineering, Minzu University of China, Beijing 100081, China
Abstract
In this study, hybrid state-action-reward-state-action (SARSA(λ)) and Q-learning algorithms are applied at different stages of upper confidence bounds applied to trees (UCT) search for Tibetan Jiu chess. Q-learning is also used to update all the nodes on the search path when each game ends. A learning strategy is proposed that combines SARSA(λ) and Q-learning with domain knowledge in the feedback functions of the layout and battle stages. An improved deep neural network based on ResNet18 is used for self-play training. Experimental results show that hybrid online and offline reinforcement learning with a deep neural network can improve the game program's learning efficiency and its understanding of Tibetan Jiu chess.
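The following is a minimal Python sketch of the two update rules the abstract names, not the authors' implementation: an online SARSA(λ) update applied during play, and a Q-learning backup over all nodes on the UCT search path once a game ends. The tabular representation, the hyperparameters, the (state, action) encoding, and the actions_fn helper are all assumptions made for illustration.

    # Sketch only: hybrid online/offline updates as described in the abstract.
    from collections import defaultdict

    ALPHA, GAMMA, LAMBDA = 0.1, 0.99, 0.9   # assumed hyperparameters

    Q = defaultdict(float)            # Q-values keyed by (state, action)
    eligibility = defaultdict(float)  # eligibility traces for SARSA(lambda)

    def sarsa_lambda_step(s, a, r, s_next, a_next):
        """On-policy SARSA(lambda) update, applied online during a game."""
        delta = r + GAMMA * Q[(s_next, a_next)] - Q[(s, a)]
        eligibility[(s, a)] += 1.0                  # accumulating trace
        for key in list(eligibility):
            Q[key] += ALPHA * delta * eligibility[key]
            eligibility[key] *= GAMMA * LAMBDA      # decay all traces

    def q_learning_backup(search_path, final_reward, actions_fn):
        """Off-policy Q-learning sweep over the UCT search path at game end.

        search_path is a list of (s, a, s_next) transitions; actions_fn(s)
        returns the legal actions in s (a hypothetical helper).
        """
        r = final_reward
        for s, a, s_next in reversed(search_path):
            best_next = max((Q[(s_next, b)] for b in actions_fn(s_next)),
                            default=0.0)
            Q[(s, a)] += ALPHA * (r + GAMMA * best_next - Q[(s, a)])
            r = 0.0  # only the terminal transition carries the game outcome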
Funder
National Natural Science Foundation of China
Subject
Multidisciplinary, General Computer Science
Cited by
4 articles.
1. A Nested Three-Stage Game Algorithm Based on Chess Shape Evaluation for Tibetan Jiu Chess; 2024 IEEE 48th Annual Computers, Software, and Applications Conference (COMPSAC); 2024-07-02
2. Tibetan Jiu Chess Intelligent Game Platform; Communications in Computer and Information Science; 2024
3. A phased game algorithm combining deep reinforcement learning and UCT for Tibetan Jiu chess; 2023 IEEE 47th Annual Computers, Software, and Applications Conference (COMPSAC); 2023-06
4. The Survey of Self-play Method in Computer Games; Cognitive Computation and Systems; 2023