Abstract
For decades, behavioral scientists have used the matching law to quantify how animals distribute their choices between multiple options in response to the reinforcement they receive. More recently, many reinforcement learning (RL) models have been developed to explain choice by integrating reward feedback over time. Despite the reasonable success of RL models in capturing choice on a trial-by-trial basis, these models cannot capture variability in matching behavior. To address this, we developed metrics based on information theory and applied them to choice data from dynamic learning tasks in mice and monkeys. We found that a single entropy-based metric can explain 50% and 41% of the variance in matching in mice and monkeys, respectively. We then used the limitations of existing RL models in capturing entropy-based metrics to construct more accurate models of choice. Together, our entropy-based metrics provide a model-free tool to predict adaptive choice behavior and reveal underlying neural mechanisms.
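The abstract does not spell out the specific entropy-based metrics used, so the sketch below is illustrative only: it computes a simple deviation-from-matching score and the Shannon entropy of reward-conditioned stay/switch behavior for a binary choice task. The function names (`matching_deviation`, `strategy_entropy`) and the particular conditioning scheme are assumptions, not the paper's definitions.

```python
import numpy as np

def matching_deviation(choices, rewards):
    """Fraction of choices to option 1 minus fraction of total rewards
    earned on option-1 trials. Zero indicates perfect matching.
    `choices` and `rewards` are binary arrays (1 = option 1 / rewarded).
    """
    choice_frac = np.mean(choices)
    reward_frac = rewards[choices == 1].sum() / max(rewards.sum(), 1)
    return choice_frac - reward_frac

def strategy_entropy(choices, rewards):
    """Shannon entropy (bits) of the joint distribution over
    (previous outcome: win/lose) x (response: stay/switch).
    Lower entropy means more predictable structure, e.g. strict
    win-stay/lose-switch. Illustrative metric, not the paper's.
    """
    stay = (choices[1:] == choices[:-1]).astype(int)  # 1 = repeated choice
    win = rewards[:-1]                                # outcome before each response
    joint = np.zeros((2, 2))
    for w, s in zip(win, stay):
        joint[w, s] += 1
    p = joint.flatten() / joint.sum()
    p = p[p > 0]                                      # drop empty cells
    return float(-np.sum(p * np.log2(p)))
```

For example, an animal that always chooses option 1 and is always rewarded yields a matching deviation of 0 and a strategy entropy of 0 bits (perfectly predictable behavior), while behavior uncorrelated with reward approaches the 2-bit maximum of this joint distribution.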
Funder
U.S. Department of Health & Human Services | National Institutes of Health
Publisher
Springer Science and Business Media LLC
Subject
General Physics and Astronomy, General Biochemistry, Genetics and Molecular Biology, General Chemistry
Cited by
12 articles.