Why do valence asymmetries emerge in value learning? A reinforcement learning account

Published: 2022-12-28
ISSN: 1530-7026
Container-title: Cognitive, Affective, & Behavioral Neuroscience
Language: en
Short-container-title: Cogn Affect Behav Neurosci

Author:

Hao Chenxu, Cabrera-Haro Lilian E., Lin Ziyong, Reuter-Lorenz Patricia A., Lewis Richard L.

Abstract

The Value Learning Task (VLT; e.g., Raymond & O’Brien, 2009) is widely used to investigate how acquired value affects the way we perceive and process stimuli. The task consists of a series of trials in which participants attempt to maximize accumulated winnings as they choose between a pair of presented images associated with probabilistic win, loss, or no-change outcomes. The probabilities and outcomes are initially unknown to the participant, so the task involves decision making and learning under uncertainty. Despite the symmetric outcome structure for win and loss pairs, people learn win associations better than loss associations (Lin, Cabrera-Haro, & Reuter-Lorenz, 2020). This learning asymmetry could lead to differences when the stimuli are probed in subsequent tasks, compromising inferences about how acquired value affects downstream processing. We investigate the nature of the asymmetry using a standard error-driven reinforcement learning model with a softmax choice rule. Despite having no special role for valence, the model yields the learning asymmetry observed in human behavior, whether the model parameters are set to maximize empirical fit or task payoff. The asymmetry arises from an interaction between a neutral initial value estimate and a choice policy that exploits while exploring, leading to more poorly discriminated value estimates for loss stimuli. We also show how differences in estimated individual learning rates help to explain individual differences in the observed win-loss asymmetries, and how the final value estimates produced by the model provide a simple account of a post-learning explicit value categorization task.
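The mechanism described in the abstract can be illustrated with a minimal simulation sketch. It is not the authors' implementation: it only assumes a delta-rule (error-driven) value update, a softmax choice rule over two options, and value estimates that start at a neutral 0. The outcome probabilities (0.8 vs. 0.2), outcome magnitudes (+1, -1, 0), learning rate `alpha`, inverse temperature `beta`, and the agent and trial counts are all illustrative assumptions, not values taken from the paper.

```python
import math
import random

# Hypothetical pair definitions (assumed, not from the paper):
# (P(outcome | better image), P(outcome | worse image), outcome magnitude).
# For the loss pair the "better" image is the one that loses less often.
PAIRS = {
    "win": (0.8, 0.2, 1.0),
    "loss": (0.2, 0.8, -1.0),
}


def simulate_pair(p_better, p_worse, outcome, alpha, beta, n_trials, rng):
    """Run one agent on one image pair.

    Returns (proportion of trials choosing the better image,
             final value gap Q[better] - Q[worse]).
    """
    q = [0.0, 0.0]  # neutral initial estimates; index 0 is the better image
    p_outcome = [p_better, p_worse]
    n_correct = 0
    for _ in range(n_trials):
        # Two-option softmax is a logistic function of the value difference.
        p_choose_better = 1.0 / (1.0 + math.exp(-beta * (q[0] - q[1])))
        choice = 0 if rng.random() < p_choose_better else 1
        n_correct += choice == 0
        # Probabilistic win/loss, otherwise a no-change outcome of 0.
        reward = outcome if rng.random() < p_outcome[choice] else 0.0
        # Error-driven update applied to the chosen image only.
        q[choice] += alpha * (reward - q[choice])
    return n_correct / n_trials, q[0] - q[1]


def run(n_agents=1000, n_trials=100, alpha=0.3, beta=5.0, seed=0):
    """Average accuracy and final value gap per pair type across agents."""
    rng = random.Random(seed)
    totals = {name: [0.0, 0.0] for name in PAIRS}
    for _ in range(n_agents):
        for name, (p_b, p_w, out) in PAIRS.items():
            acc, gap = simulate_pair(p_b, p_w, out, alpha, beta, n_trials, rng)
            totals[name][0] += acc
            totals[name][1] += gap
    return {name: (a / n_agents, g / n_agents) for name, (a, g) in totals.items()}


if __name__ == "__main__":
    for name, (acc, gap) in run().items():
        print(f"{name}: accuracy={acc:.3f}, value gap={gap:.3f}")
```

Under these assumed settings, the win pair should show both higher choice accuracy and a larger final value gap than the loss pair. The intuition is that a greedy-leaning policy keeps sampling the better win image, whereas in the loss pair the worse image is avoided as soon as its estimate drops below the other's, so its estimate stays poorly resolved.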

Funder

Friedrich-Alexander-Universität Erlangen-Nürnberg

Publisher

Springer Science and Business Media LLC

Subject

Behavioral Neuroscience,Cognitive Neuroscience

Link

https://link.springer.com/content/pdf/10.3758/s13415-022-01050-8.pdf

References (25 articles)

1. Aberg, K., Müller, J., & Schwartz, S. (2017). Trial-by-trial modulation of associative memory formation by reward prediction error and reward anticipation as revealed by a biologically plausible computational model. Frontiers in Human Neuroscience, 11. https://doi.org/10.3389/fnhum.2017.00056.

2. Brosch, T., & Sander, D. (2013). Neurocognitive mechanisms underlying value-based decision-making: from core values to economic value. Frontiers in Human Neuroscience, 7, 398.

3. Daw, N. (2011). Trial-by-trial data analysis using computational models. In Decision making, affect, and learning: attention and performance XXIII. Oxford University Press. https://doi.org/10.1093/acprof:oso/9780199600434.003.0001

4. Daw, N., Niv, Y., & Dayan, P. (2005). Uncertainty-based competition between prefrontal and dorsolateral striatal systems for behavioral control. Nature Neuroscience, 8(12), 1704–1711.

5. Della Libera, C., & Chelazzi, L. (2009). Learning to attend and to ignore is a matter of gains and losses. Psychological Science, 20(6), 778–784. https://doi.org/10.1111/j.1467-9280.2009.02360.x

Cited by 1 article.

1. Uncertainty in learning and decision-making: Introduction to the special issue. Cognitive, Affective, & Behavioral Neuroscience, 2023-05-24.