Deliberation gated by opportunity cost adapts to context with urgency

Author:

Touzel Maximilian Puelma, Cisek Paul, Lajoie Guillaume

Abstract

Finding the right amount of deliberation, between insufficient and excessive, is a hard decision-making problem that depends on the value we place on our time. Average reward, putatively encoded by tonic dopamine, serves in existing reinforcement learning theory as the stationary opportunity cost of time, and of deliberation in particular. However, this cost often varies with environmental context, which can change over time. Here, we introduce an opportunity cost of deliberation estimated adaptively on multiple timescales to account for non-stationary contextual factors. We use it in a simple decision-making heuristic based on average-reward reinforcement learning (AR-RL) that we call Performance-Gated Deliberation (PGD). We propose PGD as a strategy used by animals wherein deliberation cost is implemented directly as urgency, a previously characterized neural signal that effectively controls the speed of the decision-making process. We show that PGD outperforms AR-RL solutions in explaining the behaviour and urgency of non-human primates in a context-varying random walk prediction task, and that it is consistent with relative performance and urgency in a context-varying random dot motion task. We make readily testable predictions for both neural activity and behaviour, and call for an integrated research program in cognitive and systems neuroscience around the value of time.

Publisher

Cold Spring Harbor Laboratory

