
Count-Based Exploration with the Successor Representation

Published:2020-04-03 Issue:04 Volume:34 Page:5125-5133
ISSN:2374-3468
Container-title:Proceedings of the AAAI Conference on Artificial Intelligence
Short-container-title:AAAI

Author:

Marlos C. Machado, Marc G. Bellemare, Michael Bowling

Abstract

In this paper we introduce a simple approach for exploration in reinforcement learning (RL) that allows us to develop theoretically justified algorithms in the tabular case but that is also extendable to settings where function approximation is required. Our approach is based on the successor representation (SR), which was originally introduced as a representation defining state generalization by the similarity of successor states. Here we show that the norm of the SR, while it is being learned, can be used as a reward bonus to incentivize exploration. In order to better understand this transient behavior of the norm of the SR we introduce the substochastic successor representation (SSR) and we show that it implicitly counts the number of times each state (or feature) has been observed. We use this result to introduce an algorithm that performs as well as some theoretically sample-efficient approaches. Finally, we extend these ideas to a deep RL algorithm and show that it achieves state-of-the-art performance in Atari 2600 games when in a low sample-complexity regime.
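
The abstract does not spell out update rules, so the following is only a rough sketch of the idea, not the authors' implementation. It assumes a tabular SR stored as a list of rows, learned with a standard temporal-difference update, and an exploration bonus inversely proportional to the L1 norm of the current state's SR row. The function names, the parameters `gamma`, `lr`, `beta`, and `eps`, and the exact form of the bonus are all assumptions made for illustration.

```python
def sr_td_update(sr, s, s_next, gamma=0.95, lr=0.1):
    """Apply one TD(0) update to row `s` of a tabular SR (list of lists).

    Assumed target for entry j: 1[s == j] + gamma * sr[s_next][j].
    """
    for j in range(len(sr)):
        indicator = 1.0 if j == s else 0.0
        target = indicator + gamma * sr[s_next][j]
        sr[s][j] += lr * (target - sr[s][j])


def exploration_bonus(sr, s, beta=1.0, eps=1e-8):
    """Hypothetical bonus: beta divided by the L1 norm of SR row `s`.

    While the SR is still being learned from a zero initialization, the
    norm of a row grows as its state is visited, so the bonus shrinks
    with repeated visits. `eps` guards against division by zero for a
    row that has never been updated.
    """
    norm = sum(abs(x) for x in sr[s])
    return beta / (norm + eps)
```

In an agent loop, one might call `sr_td_update(sr, s, s_next)` after each transition and add `exploration_bonus(sr, s)` to the environment reward before the value update. Under these assumptions, states visited less often keep a smaller SR norm and so receive a larger bonus, which is the count-like behavior the abstract describes.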

Publisher

Association for the Advancement of Artificial Intelligence (AAAI)

Subject

General Medicine

Cited by 27 articles.

1. Autoencoder Reconstruction Model for Long-Horizon Exploration;2024 International Joint Conference on Neural Networks (IJCNN);2024-06-30

2. CMBE: Curiosity-driven Model-Based Exploration for Multi-Agent Reinforcement Learning in Sparse Reward Settings;2024 International Joint Conference on Neural Networks (IJCNN);2024-06-30

3. Deep Reinforcement Learning for Unpredictability-Induced Rewards to Handle Spacecraft Landing;2023 13th International Conference on Information Science and Technology (ICIST);2023-12-08

4. Balancing exploration and exploitation in episodic reinforcement learning;Expert Systems with Applications;2023-11

5. Learning Adaptation and Generalization from Human-Inspired Meta-Reinforcement Learning Using Bayesian Knowledge and Analysis;2023 IEEE Sixth International Conference on Artificial Intelligence and Knowledge Engineering (AIKE);2023-09-25