Authors:
Lillian M. Rigoli, Gaurav Patil, Hamish F. Stening, Rachel W. Kallen, Michael J. Richardson
Abstract
Rapid advances in Deep Reinforcement Learning (DRL) over the past several years have led to artificial agents (AAs) capable of producing behavior that meets or exceeds human-level performance in a wide variety of tasks. However, research on DRL frequently lacks adequate discussion of the low-level dynamics of the behavior itself, focusing instead on meta-level or global-level performance metrics. As a result, the current literature offers little perspective on the qualitative nature of AA behavior, leaving questions regarding its spatiotemporal patterning largely unanswered. The current study explored the degree to which the navigation and route-selection trajectories of DRL agents (i.e., AAs trained using DRL) through simple obstacle-ridden virtual environments were equivalent to (or differed from) those produced by human agents. A second, related aim was to determine whether a task-dynamical model of human route navigation could not only capture both human and DRL navigational behavior, but also help identify whether any observed differences in the navigational trajectories of humans and DRL agents were a function of differences in their dynamical environmental couplings.
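For context, task-dynamical models of human route navigation are typically formalized as a damped angular dynamical system in which the agent's heading is attracted toward the goal and repelled by obstacles. The sketch below follows the general form of the Fajen–Warren steering dynamics model; its use here is an assumption (the abstract does not name the specific model), and the symbols are illustrative: heading \(\phi\), goal direction \(\psi_g\) at distance \(d_g\), obstacle directions \(\psi_{o,i}\) at distances \(d_{o,i}\), and damping/stiffness/decay parameters \(b, k_g, k_o, c_1, \dots, c_4\):

\[
\ddot{\phi} \;=\; -\,b\,\dot{\phi}\;-\;k_g\,(\phi - \psi_g)\!\left(e^{-c_1 d_g} + c_2\right)\;+\;\sum_{i} k_o\,(\phi - \psi_{o,i})\,e^{-c_3 \lvert \phi - \psi_{o,i} \rvert}\,e^{-c_4 d_{o,i}}
\]

In this form, the goal term turns the agent's heading toward the goal (with influence growing as goal distance shrinks), while each obstacle term deflects the heading away from obstacles, with influence decaying exponentially in both angular offset and distance; route selection emerges from the interplay of these attractive and repulsive couplings.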
Funder
Australian Research Council