Making a scene – using GAN-generated scenes to test the role of real-world co-occurrence statistics and hierarchical feature spaces in scene understanding.


Kallmayer Aylin (1), Võ Melissa (2)


1. Goethe University

2. Goethe-University Frankfurt


Abstract

Our visual surroundings are highly complex. Despite this, we understand and navigate them effortlessly. This requires a complex series of transformations resulting in representations that not only span low- to high-level visual features (e.g., contours, textures, object parts and objects), but likely also reflect co-occurrence statistics of objects in real-world scenes. Here, so-called anchor objects reflect clustering statistics in real-world scenes, anchoring predictions towards frequently co-occurring smaller objects, while so-called diagnostic objects predict the larger semantic context. We investigate which of these properties underlie scene understanding across two dimensions – realism and categorisation – using scenes generated from Generative Adversarial Networks (GANs), which naturally vary along these dimensions. We show that anchor objects and mainly high-level features extracted from a range of pre-trained deep neural networks (DNNs) drove realism both at first glance and after initial processing. Categorisation performance was mainly determined by diagnostic objects, regardless of realism and DNN features, also at first glance and after initial processing. Our results are testament to the visual system’s ability to pick up on reliable, category-specific sources of information that are flexible towards disturbances across the visual feature hierarchy.


Research Square Platform LLC







