Abstract
AbstractVision is more than object recognition: In order to interact with the physical world, we estimate object properties such as mass, fragility, or elasticity by sight. The computational basis of this ability is poorly understood. Here, we propose a model based on the statistical appearance of objects, i.e., how they typically move, flow, or fold. We test this idea using a particularly challenging example: estimating the elasticity of bouncing objects. Their complex movements depend on many factors, e.g., elasticity, initial speed, and direction, and thus every object can produce an infinite number of different trajectories. By simulating and analyzing the trajectories of 100k bouncing cubes, we identified and evaluated 23 motion features that could individually or in combination be used to estimate elasticity. Experimentally teasing apart these competing but highly correlated hypotheses, we found that humans represent bouncing objects in terms of several different motion features but rely on just a single one when asked to estimate elasticity. Which feature this is, is determined by the stimulus itself: Humans rely on the duration of motion if the complete trajectory is visible, but on the maximal bounce height if the motion duration is artificially cut short. Our results suggest that observers take into account the computational costs when asked to judge elasticity and thus rely on a robust and efficient heuristic. Our study provides evidence for how such a heuristic can be derived—in an unsupervised manner—from observing the natural variations in many exemplars.Significance StatementHow do we perceive the physical properties of objects? Our findings suggest that when tasked with reporting the elasticity of bouncing cubes, observers rely on simple heuristics. Although there are many potential visual cues, surprisingly, humans tend to switch between just a handful of them depending on the characteristics of the stimulus. The heuristics predict not only the broad successes of human elasticity perception but also the striking pattern of errors observers make when we decouple the cues from ground truth. Using a big data approach, we show how the brain could derive such heuristics by observation alone. The findings are likely an example of ‘computational rationality’, in which the brain trades off task demands and relative computational costs.
Publisher
Cold Spring Harbor Laboratory