Abstract
Value learning can vary across objects that carry exactly the same reward, but the neural mechanism underlying this variability is not known. We addressed this question by recording single-unit activity in the prefrontal cortex (PFC) and substantia nigra reticulata (SNr), two key nodes in cortex-basal ganglia circuitry with crucial roles in value learning, as macaque monkeys learned to associate novel objects with either high or low rewards. Estimating trial-by-trial learned values from choice performance revealed stark differences in learned values across objects with the same reward outcome. Importantly, while PFC neurons rapidly learned to differentiate objects based on their value category, firing in SNr correlated with the variability in learned values within a value category. Our results suggest that the variation in objects’ learned values is more likely to be a readout of SNr firing, while PFC may provide a top-down teaching signal to the basal ganglia to demarcate the value categories.
Publisher
Cold Spring Harbor Laboratory