Abstract
To make good decisions in the real world, people need efficient planning strategies because their computational resources are limited. Knowing which planning strategies would work best for people in different situations would be very useful for understanding and improving human decision-making. Our ability to compute those strategies used to be limited to very small and very simple planning tasks. Here, we introduce a cognitively inspired reinforcement learning method that can overcome this limitation by exploiting the hierarchical structure of human behavior. We leverage it to understand and improve human planning in large and complex sequential decision problems. Our method decomposes sequential decision problems into two sub-problems: setting a goal and planning how to achieve it. Our method can discover optimal human planning strategies for larger and more complex tasks than was previously possible. The discovered strategies achieve a better tradeoff between decision quality and computational cost than both human planning and existing planning algorithms. We demonstrate that teaching people to use those strategies significantly increases their level of resource-rationality in tasks that require planning up to eight steps ahead. By contrast, none of the previous approaches was able to improve human performance on these problems. These findings suggest that our cognitively informed approach makes it possible to leverage reinforcement learning to improve human decision-making in complex sequential decision problems. Future work can leverage our method to develop decision support systems that improve human decision-making in the real world.
Funder
Cyber Valley Research Fund
Max Planck Institute for Intelligent Systems
Publisher
Springer Science and Business Media LLC
Subject
Developmental and Educational Psychology, Neuropsychology and Physiological Psychology
Cited by
5 articles.