Improved Feature Learning: A Maximum-Average-Out Deep Neural Network for the Game Go

Published: 2020-04-09, Volume: 2020, Pages: 1-6
ISSN: 1024-123X
Container-title: Mathematical Problems in Engineering
Language: en
Short-container-title: Mathematical Problems in Engineering

Author:

Li Xiali¹, Lv Zhengyu¹, Liu Bo¹, Wu Licheng¹, Wang Zheng²

Affiliation:

1. School of Information Engineering, Minzu University of China, Beijing 100081, China

2. Department of Management, Taiyuan Normal University, Shanxi 030619, China

Abstract

Computer game-playing programs based on deep reinforcement learning have surpassed the performance of even the best human players. However, the huge analysis space of such neural networks and their numerous parameters require extensive computing power. Hence, in this study, we aimed to increase the network learning efficiency by modifying the neural network structure, which should reduce the number of learning iterations and the required computing power. We propose a convolutional neural network with a maximum-average-out (MAO) unit structure, based on the idea of piecewise functions, through which features can be learned effectively and the expressive ability of hidden-layer features can be enhanced. To verify the performance of the MAO structure, we compared it with the ResNet18 network by applying both to the framework of AlphaGo Zero, which was developed for playing the game Go. The two network structures were trained from scratch in a low-cost server environment. The MAO network won eight out of ten games against the ResNet18 network. The superior performance of the MAO unit compared with the ResNet18 network is significant for the further development of game algorithms that require less computing power than those currently in use.

Funder

National Natural Science Foundation of China

Publisher

Hindawi Limited

Subject

General Engineering, General Mathematics

Link

http://downloads.hindawi.com/journals/mpe/2020/1397948.pdf
