Abstract
The replay of task-relevant trajectories is known to contribute to memory consolidation and improved task performance. A wide variety of experimental data show that the content of replayed sequences is highly specific and can be modulated by reward as well as other prominent task variables. However, the rules governing the choice of sequences to be replayed remain poorly understood. One recent theoretical suggestion is that the prioritization of replay experiences in decision-making problems is based on their effect on the choice of action. We show that this implies that subjects should replay sub-optimal actions that they dysfunctionally choose, rather than optimal ones, when forgetfulness leaves them with large amounts of uncertainty in their internal models of the world. We use this to account for recent experimental data demonstrating exactly such pessimal replay, fitting model parameters to the individual subjects' choices.
Publisher
Cold Spring Harbor Laboratory