Author:
Dossa Rousslan Fernand Julien, Arulkumaran Kai, Juliani Arthur, Sasai Shuntaro, Kanai Ryota
Abstract
As the apparent intelligence of artificial neural networks (ANNs) advances, they are increasingly likened to the functional networks and information processing capabilities of the human brain. Such comparisons have typically focused on particular modalities, such as vision or language. The next frontier is to use the latest advances in ANNs to design and investigate scalable models of higher-level cognitive processes, such as conscious information access, which have historically lacked concrete and specific hypotheses for scientific evaluation. In this work, we propose and empirically assess an embodied agent whose structure is based on global workspace theory (GWT), as specified in the recently proposed “indicator properties” of consciousness. In contrast to prior work on GWT, which used single modalities, our agent is trained to navigate 3D environments from realistic audiovisual inputs. We find that the global workspace architecture performs better and more robustly at smaller working memory sizes than a standard recurrent architecture. Beyond performance, we analyze the learned representations of our architecture and report findings that point to task complexity and regularization being essential for feature learning and for the development of meaningful attentional patterns within the workspace.