Author:
Dubois M, Habicht J, Michely J, Moran R, Dolan RJ, Hauser TU
Abstract
The exploration-exploitation trade-off, the arbitration between sampling a lesser-known option and a known rich one, is thought to be solved using computationally demanding exploration algorithms. Given known limitations in human cognitive resources, we hypothesised the presence of additional, cheaper strategies. We examined choice behaviour for such heuristics and show that it involves value-free random exploration, which ignores all prior knowledge, and novelty exploration, which targets novel options alone. In a double-blind, placebo-controlled drug study assessing the contributions of dopamine (400 mg amisulpride) and noradrenaline (40 mg propranolol), we show that value-free random exploration is attenuated under propranolol but not under amisulpride. Our findings demonstrate that humans deploy distinct, computationally cheap exploration strategies and that value-free random exploration is under noradrenergic control.
Data and materials availability
Data and code will be provided upon acceptance.
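The two heuristics named in the abstract can be illustrated with a minimal sketch. This is not the authors' model: the function name `choose`, the parameters `epsilon` and `novelty_bonus`, and the epsilon-greedy-style formulation are assumptions made here for illustration only.

```python
import random


def choose(values, seen, epsilon=0.1, novelty_bonus=1.0, rng=random):
    """Pick an option index (illustrative sketch, not the paper's model).

    values: estimated reward of each option (the prior knowledge)
    seen: booleans; False marks a novel, never-sampled option
    epsilon: hypothetical probability of a value-free random choice
    novelty_bonus: hypothetical bonus given to novel options only
    """
    # Value-free random exploration: ignore all estimates, pick uniformly.
    if rng.random() < epsilon:
        return rng.randrange(len(values))
    # Novelty exploration: boost only options that were never sampled.
    scores = [v + (0.0 if s else novelty_bonus) for v, s in zip(values, seen)]
    # Otherwise choose the best-scoring option.
    return max(range(len(scores)), key=scores.__getitem__)
```

In this sketch, a reduction in value-free random exploration such as the one reported under propranolol would correspond to a smaller `epsilon`, while `novelty_bonus` is left unchanged.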
Publisher
Cold Spring Harbor Laboratory
Cited by
4 articles.