A general framework for object detection

This paper presents a general trainable framework for object detection in static images of cluttered scenes. The detection technique we develop is based on a wavelet representation of an object class derived from a statistical analysis of the class instances. By learning an object class in terms of a subset of an overcomplete dictionary of wavelet basis functions, we derive a compact representation of the class, which is used as input to a support vector machine classifier. This representation both copes with in-class variability and yields a low false detection rate in unconstrained environments. We demonstrate the capabilities of the technique in two domains whose inherent information content differs significantly. The first is face detection; the second is the detection of people, who, in contrast to faces, vary greatly in color, texture, and patterns. Unlike previous approaches, this system learns from examples and does not rely on any a priori (hand-crafted) models or motion-based segmentation. The paper also presents a motion-based extension to enhance the performance of the detection algorithm over video sequences. The results presented here suggest that this architecture may well be quite general.
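To make the idea of an overcomplete wavelet dictionary concrete, the following is a minimal sketch and not the paper's implementation. It assumes Haar-like wavelets (the abstract does not name the wavelet family) and computes vertical, horizontal, and diagonal responses over windows whose stride is smaller than their support, so neighboring windows overlap. The function name `haar_responses` and its parameters are hypothetical. Selecting a subset of these features and training the support vector machine are left out.

```python
def haar_responses(img, size, stride):
    """Return a list of (vertical, horizontal, diagonal) Haar-like responses.

    img    : 2D list of pixel intensities (rows of equal length)
    size   : side length of the square wavelet support (assumed even)
    stride : shift between windows; stride < size gives an overcomplete,
             overlapping set of responses (an assumption of this sketch)
    """
    h, w = len(img), len(img[0])
    half = size // 2

    def box(r0, c0, r1, c1):
        # Sum of pixels in rows [r0, r1) and columns [c0, c1).
        return sum(img[r][c] for r in range(r0, r1) for c in range(c0, c1))

    feats = []
    for r in range(0, h - size + 1, stride):
        for c in range(0, w - size + 1, stride):
            tl = box(r, c, r + half, c + half)                # top-left quadrant
            tr = box(r, c + half, r + half, c + size)         # top-right quadrant
            bl = box(r + half, c, r + size, c + half)         # bottom-left quadrant
            br = box(r + half, c + half, r + size, c + size)  # bottom-right quadrant
            vertical = (tl + bl) - (tr + br)    # left half minus right half
            horizontal = (tl + tr) - (bl + br)  # top half minus bottom half
            diagonal = (tl + br) - (tr + bl)    # checkerboard difference
            feats.append((vertical, horizontal, diagonal))
    return feats


# A 4x4 image with a bright left half and a dark right half.
img = [[1, 1, 0, 0] for _ in range(4)]
coarse = haar_responses(img, size=4, stride=1)  # one window covering the image
fine = haar_responses(img, size=2, stride=1)    # nine overlapping 2x2 windows
```

In this toy image only the vertical response is nonzero for windows that straddle the boundary between the bright and dark halves. In a pipeline of the kind the abstract describes, such responses would be the raw inputs from which a compact feature subset is chosen for the classifier.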
