A generative framework for real time object detection and classification

We formulate a probabilistic model of image generation and derive optimal inference algorithms for finding objects and object features within this framework. The approach models images as a collage of patches of arbitrary size, some of which contain the object of interest and some of which are background. It requires likelihood-ratio models for object-generated versus background-generated patches, which are learned using boosting methods. One advantage of the generative approach proposed here is that it makes explicit the conditions under which it is optimal. We applied the approach to the problem of finding faces and eyes in arbitrary images. Optimal inference under the proposed model works in real time and is robust to changes in lighting and to differences in facial structure, including facial expressions and eyeglasses. Furthermore, the system can simultaneously track the eyes and blinks of multiple individuals. Finally, we reflect on how the development of perceptive systems like this may help advance our understanding of the human brain.
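
To make the role of the likelihood-ratio models concrete, the following is a minimal sketch of patch-wise detection, not the inference algorithm derived in the paper. It assumes each patch is scored independently and is labeled "object" when its log posterior odds (log-likelihood ratio plus log prior odds) are positive. The names `enumerate_patches`, `detect`, and `toy_log_lr` are hypothetical, and `toy_log_lr` is a hand-written stand-in for a model that would in practice be learned by boosting.

```python
import math
from typing import Callable, List, Sequence, Tuple

Patch = Tuple[int, int, int]  # (top row, left column, side length)


def enumerate_patches(height: int, width: int,
                      sizes: Sequence[int], stride: int) -> List[Patch]:
    """List every square patch of the given sizes that fits in the image."""
    patches = []
    for size in sizes:
        for r in range(0, height - size + 1, stride):
            for c in range(0, width - size + 1, stride):
                patches.append((r, c, size))
    return patches


def detect(image: List[List[float]],
           log_lr: Callable[[List[List[float]]], float],
           prior_object: float,
           sizes: Sequence[int] = (2,),
           stride: int = 1) -> List[Patch]:
    """Return patches whose log posterior odds of 'object' exceed zero.

    log posterior odds = log p(patch | object) / p(patch | background)
                         + log P(object) / P(background)
    """
    log_prior_odds = math.log(prior_object / (1.0 - prior_object))
    height, width = len(image), len(image[0])
    hits = []
    for (r, c, s) in enumerate_patches(height, width, sizes, stride):
        pixels = [row[c:c + s] for row in image[r:r + s]]
        if log_lr(pixels) + log_prior_odds > 0:
            hits.append((r, c, s))
    return hits


def toy_log_lr(pixels: List[List[float]]) -> float:
    """Hypothetical stand-in for a learned model: brighter patches
    are treated as more likely to have been generated by the object."""
    count = len(pixels) * len(pixels[0])
    mean = sum(sum(row) for row in pixels) / count
    return 10.0 * (mean - 0.5)


# A 4x4 image with a bright 2x2 block whose top-left corner is at (1, 1).
image = [
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 1.0, 0.0],
    [0.0, 1.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
]
hits = detect(image, toy_log_lr, prior_object=0.5)
```

With equal prior odds, only the patch that exactly covers the bright block is returned. Lowering `prior_object` raises the evidence a patch needs before it is reported, which is how the prior over object occurrence enters the decision in this simplified setting.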
