Exploitation-Oriented Learning with Deep Learning – Introducing Profit Sharing to a Deep Q-Network

Published:2017-09-20 Issue:5 Volume:21 Page:849-855
ISSN:1883-8014
Container-title:Journal of Advanced Computational Intelligence and Intelligent Informatics
language:en
Short-container-title:JACIII

Author:

Miyazaki Kazuteru

Abstract

Currently, deep learning is attracting significant interest. Deep Q-networks (DQNs), which combine deep learning with Q-learning, have produced excellent results for several Atari 2600 games. In this paper, we propose an exploitation-oriented learning (XoL) method that incorporates deep learning to reduce the number of trial-and-error searches. We focus on profit sharing (PS), an XoL method, and combine it with a DQN to propose the DQNwithPS method. This method is compared with a DQN on Atari 2600 games. We demonstrate that the proposed DQNwithPS method can learn stably with fewer trial-and-error searches than a DQN alone requires.

Publisher

Fuji Technology Press Ltd.

Subject

Artificial Intelligence, Computer Vision and Pattern Recognition, Human-Computer Interaction

References: 22 articles.

1. V. Mnih, K. Kavukcuoglu, D. Silver, A. Graves, I. Antonoglou, D. Wierstra, and M. Riedmiller, “Playing Atari with Deep Reinforcement Learning,” NIPS Deep Learning Workshop 2013, 2013.

2. C. J. H. Watkins and P. Dayan, “Technical note: Q-learning,” Machine Learning, Vol.8, pp. 279-292, 1992.

3. M. G. Bellemare, Y. Naddaf, J. Veness, and M. Bowling, “The arcade learning environment: An evaluation platform for general agents,” J. of Artificial Intelligence Research, Vol.47, pp. 253-279, 2013.

4. K. Miyazaki, M. Yamamura, and S. Kobayashi, “A Theory of Profit Sharing in Reinforcement Learning,” Trans. of the Japanese Society for Artificial Intelligence, Vol.9, No.4, pp. 580-587, 1994 (in Japanese).

5. K. Miyazaki, M. Yamamura, and S. Kobayashi, “On the Rationality of Profit Sharing in Reinforcement Learning,” Proc. of the 3rd Int. Conf. on Fuzzy Logic, Neural Nets and Soft Computing, pp. 285-288, 1994.

Cited by 19 articles.

1. Enhanced Naive Agent in Angry Birds AI Competition via Exploitation-Oriented Learning;Journal of Robotics and Mechatronics;2024-06-20

2. Proposal of a Course-Classification Support System Using Deep Learning and its Evaluation When Combined with Reinforcement Learning;Journal of Advanced Computational Intelligence and Intelligent Informatics;2024-03-20

3. Proposal and Evaluation of a Course-Classification-Support System Emphasizing Communication with the Sub-committees Within the Committee of Validation and Examination for Degrees;Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering;2023

4. Multi-Faceted Decision Making Using Multiple Reinforcement Learning to Reducing Wasteful Actions;Journal of Advanced Computational Intelligence and Intelligent Informatics;2022-07-20

5. Research on the Consistency of Diploma Policies and the Nomenclature of Major Fields of Academic Degrees;IEEJ Transactions on Electronics, Information and Systems;2022-02-01