A Trainable System for Object Detection

This paper presents a general, trainable system for object detection in unconstrained, cluttered scenes. The system derives much of its power from a representation that describes an object class in terms of an overcomplete dictionary of local, oriented, multiscale intensity differences between adjacent regions, efficiently computable as a Haar wavelet transform. The approach is example-based: it implicitly derives a model of an object class by training a support vector machine classifier on a large set of positive and negative examples. We present results on face, people, and car detection tasks using the same architecture. In addition, we quantify how the representation affects detection performance by considering several alternative representations, including pixels and principal components. We also describe a real-time application of our person detection system as part of a driver assistance system.
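To make the representation concrete, the following is a minimal sketch of how oriented intensity differences between adjacent regions could be computed at one scale. It is an illustration under stated assumptions, not the system described here: the function names (`integral_image`, `box_sum`, `haar_features`), the `size` and `step` parameters, and the use of a summed-area table for the region sums are all choices made for this example. Sampling the wavelets with a `step` smaller than their support makes the supports overlap, which is one way to obtain an overcomplete set of responses.

```python
def integral_image(img):
    """Summed-area table with a zero border: ii[y][x] = sum of img[0:y][0:x]."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def box_sum(ii, y0, x0, y1, x1):
    """Sum of the pixels in rows y0..y1-1 and columns x0..x1-1."""
    return ii[y1][x1] - ii[y0][x1] - ii[y1][x0] + ii[y0][x0]


def haar_features(img, size=4, step=1):
    """Vertical, horizontal, and diagonal Haar responses (hypothetical helper).

    Each wavelet has a size x size support split into four quadrants.
    A step smaller than size yields overlapping supports.
    """
    ii = integral_image(img)
    h, w = len(img), len(img[0])
    half = size // 2
    feats = []
    for y in range(0, h - size + 1, step):
        for x in range(0, w - size + 1, step):
            ym, xm, y1, x1 = y + half, x + half, y + size, x + size
            tl = box_sum(ii, y, x, ym, xm)    # top-left quadrant
            tr = box_sum(ii, y, xm, ym, x1)   # top-right quadrant
            bl = box_sum(ii, ym, x, y1, xm)   # bottom-left quadrant
            br = box_sum(ii, ym, xm, y1, x1)  # bottom-right quadrant
            vertical = (tr + br) - (tl + bl)    # right minus left: vertical edges
            horizontal = (bl + br) - (tl + tr)  # bottom minus top: horizontal edges
            diagonal = (tl + br) - (tr + bl)    # diagonal contrast
            feats.extend([vertical, horizontal, diagonal])
    return feats
```

Repeating the call with several values of `size` would give a multiscale feature vector, which could then be passed to any support vector machine implementation for training on positive and negative example windows.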
