
Playing Atari with few neurons

Published:2021-04-19 Issue:2 Volume:35
ISSN:1387-2532
Container-title:Autonomous Agents and Multi-Agent Systems
language:en
Short-container-title:Auton Agent Multi-Agent Syst

Author:

Cuccu Giuseppe, Togelius Julian, Cudré-Mauroux Philippe

Abstract

We propose a new method for learning compact state representations and policies separately but simultaneously for policy approximation in vision-based applications such as Atari games. Approaches based on deep reinforcement learning typically map pixels directly to actions to enable end-to-end training. Internally, however, the deep neural network bears the responsibility of both extracting useful information and making decisions based on it, two objectives which can be addressed independently. Separating the image processing from the action selection allows for a better understanding of either task individually, as well as potentially finding smaller policy representations, which is inherently interesting. Our approach learns state representations using a compact encoder based on two novel algorithms: (i) Increasing Dictionary Vector Quantization builds a dictionary of state representations which grows in size over time, allowing our method to address new observations as they appear in an open-ended online-learning context; and (ii) Direct Residuals Sparse Coding encodes observations as a function of the dictionary, aiming for highest information inclusion by disregarding reconstruction error and maximizing code sparsity. As the dictionary size increases, however, the encoder produces increasingly large inputs for the neural network; this issue is addressed with a new variant of the Exponential Natural Evolution Strategies algorithm which adapts the dimensionality of its probability distribution during the run. We test our system on a selection of Atari games using tiny neural networks of only 6 to 18 neurons (depending on each game’s controls). These are still capable of achieving results that are not much worse than, and occasionally superior to, the state-of-the-art in direct policy search, which uses two orders of magnitude more neurons.

Funder

National Science Foundation

Schweizerischer Nationalfonds zur Förderung der Wissenschaftlichen Forschung

Université de Fribourg

Publisher

Springer Science and Business Media LLC

Subject

Artificial Intelligence

Link

https://link.springer.com/content/pdf/10.1007/s10458-021-09497-8.pdf

Reference 50 articles.

1. Alvernaz, S., & Togelius, J. (2017). Autoencoder-augmented neuroevolution for visual doom playing. In Computational Intelligence and Games (CIG), 2017 IEEE Conference on, IEEE, pp 1–8.

2. Badia, A. P., Piot, B., Kapturowski, S., Sprechmann, P., Vitvitskyi, A., Guo, D., & Blundell, C. (2020). Agent57: Outperforming the Atari human benchmark. arXiv preprint arXiv:2003.13350.

3. Bellemare, M. G., Naddaf, Y., Veness, J., & Bowling, M. (2013). The arcade learning environment: An evaluation platform for general agents. Journal of Artificial Intelligence Research, 47, 253–279.

4. Brockman, G., Cheung, V., Pettersson, L., Schneider, J., Schulman, J., Tang, J., & Zaremba, W. (2016). OpenAI Gym. arXiv preprint arXiv:1606.01540.

5. Chrabaszcz, P., Loshchilov, I., & Hutter, F. (2018). Back to basics: Benchmarking canonical evolution strategies for playing Atari. arXiv preprint arXiv:1802.08842.

Cited by 5 articles.

1. Improved Non-Player Character (NPC) behavior using evolutionary algorithm—A systematic review;Entertainment Computing;2025-01

2. Hybrid self-attention NEAT: a novel evolutionary self-attention approach to improve the NEAT algorithm in high dimensional inputs;Evolving Systems;2023-06-12

3. Study on the diversity of mental states and neuroplasticity of the brain during human-machine interaction;Frontiers in Neuroscience;2022-12-07

4. Fault-Tolerant Scheme of Cloud Task Allocation Based on Deep Reinforcement Learning;Communications in Computer and Information Science;2022

5. Realizing Midcourse Penetration With Deep Reinforcement Learning;IEEE Access;2021