HyperBF Networks for Real Object Recognition

Even when represented in a way that is invariant to illumination conditions, a 3D object gives rise to an infinite number of 2D views, depending on its pose. It has recently been shown ([13]) that it is possible to synthesize a module that can recognize a specific 3D object from any viewpoint by using a new technique of learning from examples, which are, in this case, a small set of 2D views of the object. In this paper we extend the technique a) to deal with real objects (isolated paper clips) that suffer from noise and occlusions and b) to exploit negative examples during the learning phase. We also compare different versions of the multi-layer networks corresponding to our technique among themselves and with a standard Nearest Neighbor classifier. The simplest version, which is a Radial Basis Functions network, performs worse than a Nearest Neighbor classifier. The more powerful versions, trained with positive and negative examples, perform significantly better. Our results, which may have interesting implications for computer vision despite the relative simplicity of the task studied, are especially interesting for understanding the process of object recognition in biological vision.

1 Introduction

Shape-based visual recognition of 3D objects may be solved by first hypothesizing the viewpoint (e.g., using information on feature correspondences between the image and a 3D model), then computing the appearance of the model of the object to be recognized from that viewpoint, and finally comparing it with the actual image ([6; 20; 9; 11; 21]). Most recognition schemes developed in computer vision over the last few years employ 3D models of objects. Automatic learning of 3D models, however, is in itself a difficult problem that has not been much addressed in the past and that presents difficulties, especially for any theory that aims to account for human ability in visual recognition.
Recently, recognition schemes have been suggested that rely on a set of 2D views of the object instead of a 3D model ([2; 5; 13]) and thereby offer a natural solution to the problem of model acquisition. In particular, Poggio and Edelman ([13]) have argued that for each object there exists a smooth function mapping any perspective view into a "standard" view of the object, and that this multivariate function may be approximately synthesized from a small number of views of the object. Such a function would be object specific, with different functions corresponding to different 3D objects. …
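The idea of synthesizing a view-to-standard-view function from a few example views can be illustrated with a minimal sketch. It is not the implementation used in the paper. It assumes a hypothetical five-vertex wire object (`CLIP`), orthographic rather than perspective projection, Gaussian basis functions centered on the training views, and an arbitrarily chosen width `SIGMA`. All function names are illustrative.

```python
import math

def project(points, angle):
    """Orthographic 2D view of 3D points rotated by `angle` (radians) about the y-axis."""
    c, s = math.cos(angle), math.sin(angle)
    view = []
    for x, y, z in points:
        view += [c * x + s * z, y]  # flattened (x', y) image coordinates
    return view

def gaussian(u, v, sigma):
    """Gaussian radial basis function of the distance between two view vectors."""
    d2 = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-d2 / (2.0 * sigma ** 2))

def solve(A, B):
    """Solve A X = B (B may have several columns) by Gaussian elimination with partial pivoting."""
    n, m = len(A), len(B[0])
    M = [list(A[i]) + list(B[i]) for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for k in range(col, n + m):
                M[r][k] -= f * M[col][k]
    X = [[0.0] * m for _ in range(n)]
    for i in reversed(range(n)):  # back substitution
        for j in range(m):
            acc = M[i][n + j] - sum(M[i][k] * X[k][j] for k in range(i + 1, n))
            X[i][j] = acc / M[i][i]
    return X

def train(views, targets, sigma):
    """Fit one coefficient row per training view so the network interpolates the targets."""
    G = [[gaussian(u, v, sigma) for v in views] for u in views]
    return solve(G, targets)

def predict(x, views, coeffs, sigma):
    """Network output: weighted sum of basis functions centered on the training views."""
    g = [gaussian(x, v, sigma) for v in views]
    return [sum(g[i] * coeffs[i][j] for i in range(len(views)))
            for j in range(len(coeffs[0]))]

# Hypothetical toy object and settings (assumptions, not data from the paper).
CLIP = [(0.0, 0.0, 0.0), (1.0, 0.2, 0.3), (1.5, 1.0, -0.4),
        (0.6, 1.6, 0.8), (-0.3, 2.2, 0.1)]
SIGMA = 1.0

train_angles = [math.radians(a) for a in (-60, -30, 0, 30, 60)]
train_views = [project(CLIP, a) for a in train_angles]
standard = project(CLIP, 0.0)  # the "standard" view is taken to be the frontal one

# Every training view is mapped to the same standard view.
coeffs = train(train_views, [standard] * len(train_views), SIGMA)

# Apply the learned map to a view that was not in the training set.
novel = predict(project(CLIP, math.radians(15)), train_views, coeffs, SIGMA)
error = math.sqrt(sum((a - b) ** 2 for a, b in zip(novel, standard)))
print("distance of mapped novel view from the standard view:", error)
```

In this sketch the distance between the network output and the standard view could serve as a recognition score: small for views of the learned object, larger for other inputs. How well it generalizes to novel views depends on the assumed `SIGMA` and on how densely the training views sample the pose range.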

[1] Robert C. Bolles, et al. Random sample consensus: a paradigm for model fitting with applications to image analysis and automated cartography, 1981, CACM.

[2] C. Micchelli. Interpolation of scattered data: Distance matrices and conditionally positive definite functions, 1986.

[3] D. W. Thompson, et al. Three-dimensional model matching from an unconstrained viewpoint, 1987, Proceedings. 1987 IEEE International Conference on Robotics and Automation.

[4] Keiichi Abe, et al. Binary picture thinning by an iterative parallel two-subcycle operation, 1987, Pattern Recognit.

[5] Anil K. Jain, et al. Algorithms for Clustering Data, 1988.

[6] Azriel Rosenfeld, et al. Computer Vision, 1988, Adv. Comput.

[7] S. Ullman. Aligning pictorial descriptions: An approach to object recognition, 1989, Cognition.

[8] Daphna Weinshall, et al. A Self-Organizing Multiple-View Representation of Three-Dimensional Objects, 1989.

[9] F. Girosi, et al. Networks for approximation and learning, 1990, Proc. IEEE.

[10] F. Girosi, et al. A Nondeterministic Minimization Algorithm, 1990.

[11] T. Poggio, et al. Regularization Algorithms for Learning That Are Equivalent to Multilayer Networks, 1990, Science.

[12] T. Poggio, et al. A network that learns to recognize three-dimensional objects, 1990, Nature.

[13] Tomaso A. Poggio, et al. Extensions of a Theory of Networks for Approximation and Learning, 1990, NIPS.

[14] Ronen Basri, et al. Recognition by Linear Combinations of Models, 1991, IEEE Trans. Pattern Anal. Mach. Intell.

[15] Shimon Edelman, et al. Bringing the Grandmother back into the Picture: A Memory-Based View of Object Recognition, 1990, Int. J. Pattern Recognit. Artif. Intell.