Multiscale filter bank approach to camera-movement control in active vision systems

Active vision systems that must search for targets in natural scenes are finding space-variant sensors increasingly important. A popular class of space-variant sensors is based on the structure of the primate retina: a small area near the optical axis has greatly increased resolution (the fovea), and resolution falls off gradually toward the periphery. In such systems, a primary requirement is an efficient mechanism for directing the optical axis to different points of interest in the visual world. In this paper, we describe a robust and efficient paradigm for foveal targeting that uses iconic scene descriptions composed of the responses of a bank of steerable filters at multiple scales and orientations. The filter bank description is rotation and scale invariant, occlusion tolerant, and view-insensitive to a considerable extent. We describe procedures for robustly matching the filter response vector of a previously foveated point to instances of that point in other, possibly transformed, images obtained after camera motion. In such situations, the multiscale structure of the filter bank lends itself naturally to an efficient targeting mechanism for the space-variant sensor.
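
The matching idea can be illustrated with a minimal sketch. This is not the implementation from the paper. It assumes first-derivative-of-Gaussian filters (steered to an angle as cos(theta)*Gx + sin(theta)*Gy), two scales, four orientations, wrap-around image borders, and a plain Euclidean nearest-neighbour search. It handles only translation, not the rotation or scale changes discussed above. All function names are hypothetical.

```python
import math
import random

def gaussian_derivative_taps(sigma):
    """List (dy, dx, gx, gy) taps for the x and y first derivatives of a Gaussian."""
    radius = int(math.ceil(3 * sigma))
    taps = []
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            g = math.exp(-(dx * dx + dy * dy) / (2 * sigma * sigma))
            taps.append((dy, dx, -dx * g / sigma ** 2, -dy * g / sigma ** 2))
    return taps

def response_vector(img, row, col, bank, angles):
    """Filter responses at one pixel, for every scale in `bank` and every angle."""
    h, w = len(img), len(img[0])
    vec = []
    for taps in bank:
        rx = ry = 0.0
        for dy, dx, gx, gy in taps:
            v = img[(row + dy) % h][(col + dx) % w]  # assumption: wrap at borders
            rx += gx * v
            ry += gy * v
        # Steering: the response at angle theta is a linear combination
        # of the two basis responses.
        for theta in angles:
            vec.append(math.cos(theta) * rx + math.sin(theta) * ry)
    return vec

def best_match(img, target, bank, angles):
    """Return the (row, col) whose response vector is closest to `target`."""
    best = None
    for r in range(len(img)):
        for c in range(len(img[0])):
            vec = response_vector(img, r, c, bank, angles)
            dist = sum((a - b) ** 2 for a, b in zip(vec, target))
            if best is None or dist < best[0]:
                best = (dist, r, c)
    return best[1], best[2]

# Synthetic scene, then a copy translated by (+3 rows, -2 columns) to stand in
# for the image obtained after a camera movement.
size = 20
rng = random.Random(0)
scene = [[rng.random() for _ in range(size)] for _ in range(size)]
shifted = [[scene[(r - 3) % size][(c + 2) % size] for c in range(size)]
           for r in range(size)]

bank = [gaussian_derivative_taps(s) for s in (1.0, 2.0)]
angles = [0.0, math.pi / 4, math.pi / 2, 3 * math.pi / 4]

target = response_vector(scene, 8, 10, bank, angles)  # previously foveated point
match = best_match(shifted, target, bank, angles)
print(match)
```

In this toy setting the point foveated at (8, 10) should be recovered at its translated position in the second image. The offset between the two positions is what a targeting mechanism would convert into a camera movement.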
