Author:
van den Berg Alexandra R., Roelfsema Pieter R., Bohte Sander M.
Abstract
The acquisition of knowledge does not occur in isolation; rather, learning experiences in the same or similar domains amalgamate. This process through which learning can accelerate over time is referred to as learning-to-learn or meta-learning. While meta-learning can be implemented in recurrent neural networks, these networks tend to be trained with architectures that are not easily interpretable or mappable to the brain and with learning rules that are biologically implausible. Specifically, these rules employ backpropagation-through-time for learning, which relies on information that is unavailable at synapses that are undergoing plasticity in the brain. While memory models that exclusively use local information for their weight updates have been developed, they have limited capacity to integrate information over long timespans and therefore cannot easily learn-to-learn. Here, we propose a novel gated recurrent network named RECOLLECT, which can flexibly retain or forget information by means of a single memory gate and biologically plausible trial-and-error learning that requires only local information. We demonstrate that RECOLLECT successfully learns to represent task-relevant information over increasingly long memory delays in a pro-/anti-saccade task, and that it learns to flush its memory at the end of a trial. Moreover, we show that RECOLLECT can learn-to-learn an effective policy on a reversal bandit task. Finally, we show that the solutions acquired by RECOLLECT resemble how animals learn similar tasks.
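As a rough illustration of what "retain or forget information by means of a single memory gate" can mean, the sketch below uses a generic convex-combination gate on a scalar memory. It is not RECOLLECT's actual update rule: the function name `gated_memory_step`, the sigmoid gate, and the hand-picked gate inputs are all assumptions made for illustration, and no learning is involved.

```python
import math


def sigmoid(x):
    """Logistic function mapping any real input to the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def gated_memory_step(memory, candidate, gate_input):
    """One update of a scalar memory controlled by a single gate.

    Hypothetical helper, not taken from the paper.
    gate near 1: the memory is overwritten by the candidate (old content forgotten)
    gate near 0: the old memory is retained
    """
    gate = sigmoid(gate_input)
    return (1.0 - gate) * memory + gate * candidate


# Store a cue, hold it across a delay, then flush it at the end of the trial.
memory = 0.0
memory = gated_memory_step(memory, candidate=1.0, gate_input=10.0)  # gate open: store cue

for _ in range(5):
    # Gate closed during the delay: the stored value barely changes.
    memory = gated_memory_step(memory, candidate=0.0, gate_input=-10.0)
stored_after_delay = memory

# Gate open again with an empty candidate: the memory is flushed.
memory = gated_memory_step(memory, candidate=0.0, gate_input=10.0)
flushed = memory
```

In this toy setting the gate inputs are fixed by hand; in the network described in the abstract, when to open or close the gate is what has to be learned by trial and error.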
Publisher
Cold Spring Harbor Laboratory