Author:
Capone Cristiano, Paolucci Pier Stanislao
Abstract
Humans and animals can learn new skills after practicing for a few hours, while current reinforcement learning algorithms require a large amount of data to achieve good performance. Recent model-based approaches show promising results by reducing the number of interactions with the environment needed to learn a desirable policy. However, these methods rely on biologically implausible ingredients, such as the detailed storage of older experiences and long periods of offline learning. The optimal way to learn and exploit world-models is still an open question. Taking inspiration from biology, we suggest that dreaming might be an efficient expedient for using an inner model. We propose a two-module (agent and model) spiking neural network in which “dreaming” (living new experiences in a model-based simulated environment) significantly boosts learning. Importantly, our model does not require the detailed storage of experiences, and learns the world-model and the policy online. Moreover, we stress that our network is composed of spiking neurons, further increasing its biological plausibility and its suitability for implementation in neuromorphic hardware.
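For intuition, the sketch below illustrates the general scheme the abstract describes: interleaving "awake" episodes, in which both the policy and a world-model are learned online from real interactions without storing a replay buffer, with "dreaming" episodes, in which the policy is further trained on experience generated by the learned model. It is a minimal toy example, not the paper's spiking-network implementation; the environment, the tabular agent, and all names (ToyEnv, WorldModel, Agent, awake_episode, dream_episode) are illustrative assumptions.

```python
# Minimal sketch of alternating "awake" and "dreaming" learning phases.
# Toy tabular stand-ins for the agent and world-model; names and update
# rules are illustrative assumptions, not the paper's spiking implementation.
import random

class ToyEnv:
    """One-dimensional toy environment: reach position +5."""
    def reset(self):
        self.pos = 0
        return self.pos
    def step(self, action):                  # action in {-1, +1}
        self.pos += action
        reward = 1.0 if self.pos >= 5 else 0.0
        done = self.pos >= 5 or self.pos <= -5
        return self.pos, reward, done

class WorldModel:
    """Learned (here: tabular) approximation of the environment dynamics."""
    def __init__(self):
        self.transitions = {}                # (state, action) -> (next_state, reward, done)
    def update(self, s, a, s_next, r, done):
        self.transitions[(s, a)] = (s_next, r, done)   # online, no replay buffer
    def step(self, s, a):
        # Fall back to a naive guess for transitions never seen while awake.
        return self.transitions.get((s, a), (s + a, 0.0, abs(s + a) >= 5))

class Agent:
    """Epsilon-greedy agent with a tabular state-action value function."""
    def __init__(self, lr=0.1, gamma=0.95, eps=0.2):
        self.q, self.lr, self.gamma, self.eps = {}, lr, gamma, eps
    def act(self, s):
        if random.random() < self.eps:
            return random.choice((-1, 1))
        return max((-1, 1), key=lambda a: self.q.get((s, a), 0.0))
    def update(self, s, a, r, s_next, done):
        target = r if done else r + self.gamma * max(
            self.q.get((s_next, b), 0.0) for b in (-1, 1))
        old = self.q.get((s, a), 0.0)
        self.q[(s, a)] = old + self.lr * (target - old)

def awake_episode(env, agent, model, max_steps=50):
    """Interact with the real environment; learn policy and model online."""
    s, done, steps = env.reset(), False, 0
    while not done and steps < max_steps:
        a = agent.act(s)
        s_next, r, done = env.step(a)
        agent.update(s, a, r, s_next, done)  # policy update from real data
        model.update(s, a, s_next, r, done)  # world-model update from real data
        s, steps = s_next, steps + 1

def dream_episode(agent, model, start_state=0, max_steps=50):
    """Train the policy on imagined experience generated by the learned model."""
    s, steps = start_state, 0
    while steps < max_steps:
        a = agent.act(s)
        s_next, r, done = model.step(s, a)
        agent.update(s, a, r, s_next, done)  # extra policy updates, no real data
        if done:
            break
        s, steps = s_next, steps + 1

env, agent, model = ToyEnv(), Agent(), WorldModel()
for episode in range(20):
    awake_episode(env, agent, model)
    dream_episode(agent, model)              # "dreaming" between awake episodes
```

The design point this illustrates is that the dreaming phase provides additional policy updates at no cost in real environment interactions, and that both modules are updated incrementally, without the detailed storage of past experiences.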
Publisher
Springer Science and Business Media LLC