Hierarchical Reinforcement Learning Explains Task Interleaving Behavior

Published: 2020-11-05 Issue: 3 Volume: 4 Pages: 284-304
ISSN: 2522-0861
Container title: Computational Brain & Behavior
Language: en
Short container title: Comput Brain Behav

Author:

Gebhardt Christoph, Oulasvirta Antti, Hilliges Otmar

Abstract

How do people decide how long to continue in a task, when to switch, and to which other task? It is known that task interleaving adapts situationally, showing sensitivity to changes in expected rewards, costs, and task boundaries. However, the mechanisms that underpin the decision to stay in a task versus switch away are not thoroughly understood. Previous work has explained task interleaving by greedy heuristics and a policy that maximizes the marginal rate of return. However, it is unclear how such a strategy would allow for adaptation to environments that offer multiple tasks with complex switch costs and delayed rewards. Here, we develop a hierarchical model of supervisory control driven by reinforcement learning (RL). The core assumption is that the supervisory level learns to switch using task-specific approximate utility estimates, which are computed on the lower level. We show that a hierarchically optimal value function decomposition can be learned from experience, even in conditions with multiple tasks and arbitrary and uncertain reward and cost structures. The model also reproduces well-known key phenomena of task interleaving, such as the sensitivity to costs of resumption and immediate as well as delayed in-task rewards. In a demanding task interleaving study with 211 human participants and realistic tasks (reading, mathematics, question-answering, recognition), the model yielded better predictions of individual-level data than a flat (non-hierarchical) RL model and an omniscient-myopic baseline. Corroborating emerging evidence from cognitive neuroscience, our results suggest hierarchical RL as a plausible model of supervisory control in task interleaving.
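As a rough illustration of the core assumption stated in the abstract, the sketch below shows how a supervisory level might pick a task from task-specific utility estimates net of resumption costs. It is not the authors' model: the function name `choose_task`, the dictionary inputs, and the greedy selection rule are all assumptions made for this example, and in the paper the supervisory policy and the lower-level estimates are learned with RL rather than supplied by hand.

```python
from typing import Dict, Optional


def choose_task(utility: Dict[str, float],
                resumption_cost: Dict[str, float],
                current: Optional[str]) -> str:
    """Greedy supervisory choice (hypothetical illustration).

    utility: per-task utility estimates, standing in for values
        that a lower level would compute.
    resumption_cost: cost paid when switching to a task other
        than the current one.
    current: the task being worked on, or None at the start.
    """
    def net_value(task: str) -> float:
        # Staying in the current task incurs no resumption cost.
        cost = 0.0 if task == current else resumption_cost[task]
        return utility[task] - cost

    # Pick the task whose estimated utility net of switch cost is highest.
    return max(utility, key=net_value)


if __name__ == "__main__":
    costs = {"reading": 0.3, "math": 0.3}
    # A small advantage for "math" does not cover the resumption cost.
    print(choose_task({"reading": 1.0, "math": 1.2}, costs, "reading"))
    # A larger advantage does.
    print(choose_task({"reading": 1.0, "math": 1.5}, costs, "reading"))
```

With these made-up numbers the first call stays in the current task and the second switches, which mirrors the sensitivity to resumption costs that the abstract describes.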

Funder

Schweizerischer Nationalfonds zur Förderung der Wissenschaftlichen Forschung

Publisher

Springer Science and Business Media LLC

Subject

Developmental and Educational Psychology, Neuropsychology and Physiological Psychology

Link

https://link.springer.com/content/pdf/10.1007/s42113-020-00093-9.pdf

References: 48 articles.

1. Altmann, E., & Trafton, J. (2002). Memory for goals: an activation-based model. Cognitive Science, 26(1), 39–83.

2. Altmann, E., & Trafton, J. (2007). Timecourse of recovery from task interruption: data and a model. Psychonomic Bulletin & Review, 14(6), 1079–1084.

3. Andre, D., & Russell, S. (2002). State abstraction for programmable reinforcement learning agents. In Eighteenth National Conference on Artificial Intelligence, 119–125.

4. Bailey, B., & Konstan, J. (2006). On the need for attention-aware systems: measuring effects of interruption on task performance, error rate, and affective state. Computers in Human Behavior, 22, 685–708.

5. Balaguer, J., Spiers, H., Hassabis, D., & Summerfield, C. (2016). Neural mechanisms of hierarchical planning in a virtual subway network. Neuron, 90(4), 893–903.

Cited by 16 articles.

1. A Workflow for Building Computationally Rational Models of Human Behavior;Computational Brain & Behavior;2024-08-15

2. Perceptions of a Robot that Interleaves Tasks for Multiple Users;ACM Transactions on Human-Robot Interaction;2024-05-23

3. Heads-Up Multitasker: Simulating Attention Switching On Optical Head-Mounted Displays;Proceedings of the CHI Conference on Human Factors in Computing Systems;2024-05-11

4. Supporting Task Switching with Reinforcement Learning;Proceedings of the CHI Conference on Human Factors in Computing Systems;2024-05-11

5. Explaining crowdworker behaviour through computational rationality;Behaviour & Information Technology;2024-04-24