A network that learns to recognize three-dimensional objects

The visual recognition of three-dimensional (3-D) objects on the basis of their shape poses at least two difficult problems. First, there is the problem of variable illumination, which can be addressed by working with relatively stable features such as intensity edges rather than the raw intensity images¹,². Second, there is the problem of the initially unknown pose of the object relative to the viewer. In one approach to this problem, a hypothesis is first made about the viewpoint, then the appearance of a model object from such a viewpoint is computed and compared with the actual image³⁻⁷. Such recognition schemes generally employ 3-D models of objects, but the automatic learning of 3-D models is itself a difficult problem⁸,⁹. To address this problem in computational vision, we have developed a scheme, based on the theory of approximation of multivariate functions, that learns from a small set of perspective views a function mapping any viewpoint to a standard view. A network equivalent to this scheme will thus 'recognize' the object on which it was trained from any viewpoint.
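The idea of learning a view-to-standard-view mapping from a few perspective views can be illustrated with a minimal sketch. The abstract does not specify the approximation scheme, so everything concrete here is an assumption made for illustration: Gaussian radial basis functions centred on the training views, a random six-vertex object, a feature vector built from the projected vertex coordinates, and the helper names `random_rotation`, `perspective_view`, `gaussian_kernel`, `train` and `predict`.

```python
import numpy as np

def random_rotation(rng):
    """Draw a random 3-D rotation matrix from a QR decomposition."""
    q, r = np.linalg.qr(rng.normal(size=(3, 3)))
    q = q * np.sign(np.diag(r))       # fix column signs so the factorization is unique
    if np.linalg.det(q) < 0:          # force a proper rotation (det = +1)
        q[:, 0] = -q[:, 0]
    return q

def perspective_view(vertices, rotation, focal=5.0, distance=5.0):
    """Rotate the object, project it in perspective, flatten to a feature vector."""
    p = vertices @ rotation.T
    depth = p[:, 2] + distance        # camera sits `distance` units along z (assumed)
    return (focal * p[:, :2] / depth[:, None]).ravel()

def gaussian_kernel(a, b, sigma):
    """Gaussian radial basis values between every row of a and every row of b."""
    sq = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-sq / (2.0 * sigma ** 2))

def train(views, standard_view, sigma):
    """Solve for weights that send every training view to the standard view."""
    gram = gaussian_kernel(views, views, sigma)
    targets = np.tile(standard_view, (len(views), 1))
    return np.linalg.solve(gram, targets)

def predict(view, centers, weights, sigma):
    """Map one view through the trained network (one basis unit per training view)."""
    return (gaussian_kernel(view[None, :], centers, sigma) @ weights)[0]

rng = np.random.default_rng(0)
obj = rng.uniform(-1, 1, size=(6, 3))             # hypothetical wire-frame object
standard = perspective_view(obj, np.eye(3))       # the chosen "standard" view
train_views = np.stack(
    [perspective_view(obj, random_rotation(rng)) for _ in range(40)]
)
SIGMA = 1.0                                       # basis width, chosen by hand
weights = train(train_views, standard, SIGMA)

novel = perspective_view(obj, random_rotation(rng))
error = np.linalg.norm(predict(novel, train_views, weights, SIGMA) - standard)
print("distance from the standard view for a novel viewpoint:", error)
```

In a sketch like this, recognition would amount to thresholding the distance between the network output and the standard view: views of the trained object should land near it, while views of other objects should not. The number of training views and the basis width are illustrative choices, not values taken from the paper.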

[1] D. B. Bender, et al. Visual properties of neurons in inferotemporal cortex of the Macaque, 1972, Journal of Neurophysiology.

[2] S. Ullman. The Interpretation of Visual Motion, 1979.

[3] Robert C. Bolles, et al. Random sample consensus: a paradigm for model fitting with applications to image analysis and automated cartography, 1981, CACM.

[4] H. C. Longuet-Higgins, et al. A computer algorithm for reconstructing a scene from two projections, 1981, Nature.

[5] Thomas S. Huang, et al. Uniqueness and Estimation of Three-Dimensional Motion Parameters of Rigid Objects with Curved Surfaces, 1984, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[6] Tomaso Poggio, et al. Computational vision and regularization theory, 1985, Nature.

[7] Geoffrey E. Hinton, et al. Learning representations by back-propagating errors, 1986, Nature.

[8] J. Mason, et al. Algorithms for approximation, 1987.

[9] W. Eric L. Grimson, et al. Localizing Overlapping Parts by Searching the Interpretation Tree, 1987, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[10] I. Rock, et al. A case of viewer-centered object perception, 1987, Cognitive Psychology.

[11] A. J. Mistlin, et al. Visual neurones responsive to faces, 1987, Trends in Neurosciences.

[12] T. Poggio, et al. Parallel integration of vision modules, 1988, Science.

[13] D. Broomhead, et al. Radial Basis Functions, Multi-Variable Functional Interpolation and Adaptive Networks, 1988.

[14] S. Ullman. Aligning pictorial descriptions: An approach to object recognition, 1989, Cognition.

[15] Shimon Edelman, et al. Integrating visual cues for object segmentation and recognition, 1989.