Abstract
Reward-prediction errors coded by dopaminergic neurons may underlie the acquisition of vocal skills such as birdsong or speech, where sounding like one’s “tutor” is intrinsically rewarding. It remains unclear what kind of intrinsic reward drives the acquisition of complex vocal behaviors. We elucidate the reward computation for learning a sound inventory in juvenile zebra finches that learn new song syllables. To capture the known efficiency with which syllables are learned regardless of their order, we consider a Multi-Actor Reinforcement Learning (MARL) model in which motor actors (one per syllable) cooperate to maximize a common reward. This model outperforms computationally simpler alternatives and successfully predicts a situation in which a syllable is excluded from a song and replaced by a call. Key to our model is a sum-max operation over reward components corresponding to all possible pairwise comparisons between actors and target syllables. This operation predicts non-intuitive firing properties of dopaminergic neurons during song learning.
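The sum-max operation can be illustrated with a minimal sketch. The abstract does not specify the order of the two operations, so this sketch assumes one possible reading: for each target syllable, take the maximum reward component over all actors, then sum over targets. The function name `sum_max_reward` and the similarity matrix are hypothetical and are not taken from the paper.

```python
def sum_max_reward(similarity):
    """Common reward under an assumed sum-max reading.

    similarity[i][j] is a hypothetical reward component comparing
    the output of actor i with target syllable j.
    """
    n_targets = len(similarity[0])
    # For each target, keep the best-matching actor (max over actors),
    # then add these per-target maxima into one shared reward (sum).
    return sum(max(row[j] for row in similarity) for j in range(n_targets))


# Two actors, two targets (illustrative values only).
similarity = [
    [1.0, 0.25],  # actor 0 compared with targets 0 and 1
    [0.5, 0.5],   # actor 1 compared with targets 0 and 1
]
reward = sum_max_reward(similarity)
```

Under this reading the reward does not depend on which actor matches which target, which is consistent with the abstract's point that syllables are learned regardless of their order.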
Publisher
Cold Spring Harbor Laboratory