Author:
Sands L. Paul, Jiang Angela, Jones Rachel E., Trattner Jonathan D., Kishida Kenneth T.
Abstract
Summary
How the human brain generates conscious phenomenal experience is a fundamental problem. In particular, it is unknown how variable and dynamic changes in subjective affect are driven by interactions with objective phenomena. We hypothesize a neurocomputational mechanism that generates valence-specific learning signals associated with ‘what it is like’ to be rewarded or punished. Our hypothesized model maintains a partition between appetitive and aversive information while generating independent and parallel reward and punishment learning signals. This valence-partitioned reinforcement learning (VPRL) model and its associated learning signals are shown to predict dynamic changes in 1) human choice behavior, 2) phenomenal subjective experience, and 3) BOLD-imaging responses that implicate a network of regions that process appetitive and aversive information that converge on the ventral striatum and ventromedial prefrontal cortex during moments of introspection. Our results demonstrate the utility of valence-partitioned reinforcement learning as a neurocomputational basis for investigating mechanisms that may drive conscious experience.
Highlights
TD-Reinforcement Learning (RL) theory interprets punishments relative to rewards.
Environmentally, appetitive and aversive events are statistically independent.
Valence-partitioned RL (VPRL) processes reward and punishment independently.
We show VPRL better accounts for human choice behavior and associated BOLD activity.
VPRL signals predict dynamic changes in human subjective experience.
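The valence partition described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract gives no equations, so the class name `VPRLAgent`, the learning rates `alpha_pos` and `alpha_neg`, the one-step (bandit-style) update without temporal bootstrapping, and the choice to combine the two systems as a simple difference are all assumptions made for illustration. The sketch only shows the core idea that positive and negative outcomes feed two separate value estimates, each with its own prediction error.

```python
from dataclasses import dataclass, field


@dataclass
class VPRLAgent:
    """Illustrative valence-partitioned learner (hypothetical, simplified)."""

    n_actions: int
    alpha_pos: float = 0.1  # assumed learning rate, appetitive system
    alpha_neg: float = 0.1  # assumed learning rate, aversive system
    q_pos: list = field(default_factory=list)
    q_neg: list = field(default_factory=list)

    def __post_init__(self):
        # Separate value tables for rewards and punishments.
        self.q_pos = [0.0] * self.n_actions
        self.q_neg = [0.0] * self.n_actions

    def update(self, action, outcome):
        # Partition the outcome by valence: the appetitive system sees only
        # gains, the aversive system sees only the magnitude of losses.
        reward = max(outcome, 0.0)
        punishment = max(-outcome, 0.0)

        # Independent, parallel prediction errors (one per system).
        delta_pos = reward - self.q_pos[action]
        delta_neg = punishment - self.q_neg[action]

        self.q_pos[action] += self.alpha_pos * delta_pos
        self.q_neg[action] += self.alpha_neg * delta_neg
        return delta_pos, delta_neg

    def net_value(self, action):
        # Assumed combination rule: appetitive minus aversive estimate.
        return self.q_pos[action] - self.q_neg[action]


agent = VPRLAgent(n_actions=2)
first = agent.update(action=0, outcome=1.0)    # a gain
second = agent.update(action=0, outcome=-2.0)  # a loss
```

In this sketch a loss does not produce a large negative error in the appetitive system, as it would in a single-channel TD learner; the appetitive system simply registers the absence of reward, while the aversive system carries the punishment signal.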
Publisher
Cold Spring Harbor Laboratory
Cited by
1 article.