Abstract
Cognitive control is typically understood as a set of mechanisms that enable humans to reach goals that require integrating the consequences of actions over longer time scales. Importantly, relying on routine behaviour or making choices that are beneficial only at short time scales would prevent one from attaining these goals. During the past two decades, researchers have proposed various computational cognitive models that successfully account for behaviour related to cognitive control in a wide range of laboratory tasks. As humans operate in a dynamic and uncertain environment, making elaborate plans and integrating experience over multiple time scales is computationally expensive. However, it remains poorly understood how uncertain consequences at different time scales are integrated into adaptive decisions. Here, we pursue the idea that cognitive control can be cast as active inference over a hierarchy of time scales, where inference, i.e., planning, at higher levels of the hierarchy controls inference at lower levels. We introduce the novel concept of meta-control states, which link higher-level beliefs with lower-level policy inference. Specifically, we conceptualize cognitive control as inference over these meta-control states, where solutions to cognitive control dilemmas emerge through surprisal minimisation at different hierarchy levels. We illustrate this concept using the exploration-exploitation dilemma based on a variant of a restless multi-armed bandit task. We demonstrate that beliefs about contexts and meta-control states at a higher level dynamically modulate the balance of exploration and exploitation at the lower level of a single action. Finally, we discuss the generalisation of this meta-control concept to other control dilemmas.
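To make the task setting concrete, the following is a minimal sketch of the kind of restless multi-armed bandit used to illustrate the exploration-exploitation dilemma: arm reward probabilities drift over time, and an agent's softmax inverse temperature `beta` plays the role of a simple exploration-exploitation knob. This is an illustrative simplification, not the authors' hierarchical active-inference model; all function names and parameter values (`drift`, `lr`, `beta`) are assumptions for the sketch.

```python
import math
import random

def restless_drift(probs, drift=0.05):
    """Randomly perturb each arm's reward probability (the 'restless' part),
    clipping to [0, 1]."""
    return [min(1.0, max(0.0, p + random.gauss(0.0, drift))) for p in probs]

def softmax_choice(values, beta):
    """Sample an arm from a softmax over value estimates.
    Higher beta -> more exploitation; lower beta -> more exploration."""
    weights = [math.exp(beta * v) for v in values]
    r = random.random() * sum(weights)
    for arm, w in enumerate(weights):
        r -= w
        if r <= 0:
            return arm
    return len(weights) - 1

def run(trials=200, beta=3.0, lr=0.2, seed=0):
    """Simulate a two-armed restless bandit with a delta-rule learner."""
    random.seed(seed)
    probs = [0.8, 0.2]      # true (drifting) reward probabilities
    values = [0.5, 0.5]     # agent's value estimates
    total_reward = 0
    for _ in range(trials):
        arm = softmax_choice(values, beta)
        reward = 1 if random.random() < probs[arm] else 0
        values[arm] += lr * (reward - values[arm])   # delta-rule update
        total_reward += reward
        probs = restless_drift(probs)
    return total_reward
```

In the paper's framing, a fixed `beta` is exactly what meta-control avoids: beliefs about the current context at a higher hierarchy level would dynamically adjust the exploration-exploitation balance rather than leaving it as a static parameter.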
Funder
Technische Universität Dresden
Publisher
Springer Science and Business Media LLC
Subject
Behavioral Neuroscience, Cognitive Neuroscience
Cited by
12 articles