Author:
Hoang Huu, Tsutsumi Shinichiro, Matsuzaki Masanori, Kano Masanobu, Toyama Keisuke, Kitamura Kazuo, Kawato Mitsuo
Abstract
Although the cerebellum is typically linked to supervised learning algorithms, it also exhibits extensive connections to reward processing. In this study, we investigated the cerebellum’s role in executing reinforcement learning algorithms, with a particular emphasis on reward-prediction errors, which are essential to these algorithms. We employed a Q-learning model to accurately reproduce the licking responses of mice in a Go/No-go auditory-discrimination task. This approach enabled the calculation of reinforcement learning variables, such as reward, predicted reward, and reward-prediction errors, in each learning trial. By tensor component analysis of two-photon Ca2+ imaging data, we found that climbing fiber inputs of two distinct components, which were specifically activated during the Go and No-go cues over the course of learning, showed an inverse relationship with predictive reward-prediction errors. Given the hypothesis of bidirectional parallel-fiber Purkinje-cell synaptic plasticity, Purkinje cells in these components could develop specific motor commands for their respective auditory cues, guided by the predictive reward-prediction errors conveyed by their climbing fiber inputs. These results indicate a possible role of context-specific actors in modular reinforcement learning, integrating with cerebellar supervised learning capabilities.
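For readers unfamiliar with the modeling step described in the abstract, the following is a minimal, hypothetical sketch of how a Q-learning model can generate trial-by-trial reward-prediction errors in a Go/No-go task. It is not the authors' code; the learning rate, policy, and variable names (alpha, q_lick, rpe_trace) are illustrative assumptions.

```python
# Minimal sketch (assumed parameters, not the authors' implementation):
# a Q-learning agent that licks in response to Go/No-go cues and computes
# a reward-prediction error (RPE) on every rewarded or unrewarded lick.
import numpy as np

rng = np.random.default_rng(0)

alpha = 0.1                           # learning rate (assumed value)
n_trials = 500
cues = rng.integers(0, 2, n_trials)   # 0 = No-go cue, 1 = Go cue

# Q-values: predicted reward for licking in response to each cue
q_lick = np.zeros(2)
rpe_trace = np.zeros(n_trials)

for t, cue in enumerate(cues):
    # Sigmoid policy on the predicted reward, with a floor to keep exploration
    lick_prob = 1.0 / (1.0 + np.exp(-5.0 * q_lick[cue]))
    lick = rng.random() < max(lick_prob, 0.1)

    # Reward is delivered only for licking to the Go cue
    reward = 1.0 if (lick and cue == 1) else 0.0

    if lick:
        rpe = reward - q_lick[cue]    # reward-prediction error
        q_lick[cue] += alpha * rpe    # Q-learning update
        rpe_trace[t] = rpe

print("final Q-values (No-go, Go):", q_lick.round(2))
```

In such a sketch, q_lick plays the role of the predicted reward and rpe_trace the per-trial reward-prediction error; the study fits an analogous model to the observed licking behavior and then relates the resulting trial-by-trial error signals to climbing fiber activity identified by tensor component analysis.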
Publisher
Cold Spring Harbor Laboratory
Cited by
2 articles.