Abstract
For human vision to be explained by a computational theory, the first question is plain: What are the problems that the brain solves when we see? It is argued that vision is the construction of efficient symbolic descriptions from images of the world. An important aspect of vision is therefore the choice of representations for the different kinds of information in a visual scene. An overall framework is suggested for extracting shape information from images, in which the analysis proceeds through three representations: (1) the primal sketch, which makes explicit the intensity changes and local two-dimensional geometry of an image; (2) the 2½-D sketch, which is a viewer-centred representation of the depth, orientation and discontinuities of the visible surfaces; and (3) the 3-D model representation, which allows an object-centred description of the three-dimensional structure and organization of a viewed shape. The critical act in formulating computational theories for processes capable of constructing these representations is the discovery of valid constraints on the way the world behaves that provide sufficient additional information to allow recovery of the desired characteristic. Finally, once a computational theory for a process has been formulated, algorithms for implementing it may be designed, and their performance compared with that of the human visual processor.
Subject
Industrial and Manufacturing Engineering; General Agricultural and Biological Sciences; General Business, Management and Accounting; Materials Science (miscellaneous); Business and International Management
Cited by
144 articles.