Attention mechanisms and component-based face detection

Locating and identifying complex objects in a visual scene is a common problem in computer vision and image analysis. One way to reduce the amount of image data that must be classified is to base the classification on smaller features of the image, which are then combined into a more complex structure that identifies the complete object. For example, locating two eyes, a nose and a mouth can enable us to identify a face without attending to the hair, chin or cheeks. In this paper, we present a system and training technique for learning to recognise an object from its component features. Our system incorporates an attention-based mechanism that predicts the location of features. We demonstrate the effectiveness of our system with an experiment in face detection, in which the attention mechanism is shown to improve both the overall classification speed and the accuracy of feature location.
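The idea of using attention to predict where a feature should appear can be illustrated with a minimal sketch. It is not the system described in the paper: it assumes, purely for illustration, that the attention mechanism is approximated by the average displacement of each component from one anchor component (here the left eye), learned from annotated examples. The function names `learn_offsets`, `attention_windows` and `window_area`, the component labels, and the `radius` parameter are all hypothetical.

```python
from statistics import mean


def learn_offsets(examples, anchor="left_eye"):
    """Average displacement of each component from the anchor component.

    `examples` is a list of dicts mapping a component name to its (x, y)
    position in one annotated training image (hypothetical data layout).
    """
    parts = [name for name in examples[0] if name != anchor]
    offsets = {}
    for part in parts:
        dx = mean(ex[part][0] - ex[anchor][0] for ex in examples)
        dy = mean(ex[part][1] - ex[anchor][1] for ex in examples)
        offsets[part] = (round(dx), round(dy))
    return offsets


def attention_windows(anchor_xy, offsets, radius=2):
    """Predict a small square search window (x0, y0, x1, y1) per component."""
    ax, ay = anchor_xy
    windows = {}
    for part, (dx, dy) in offsets.items():
        cx, cy = ax + dx, ay + dy  # expected centre of this component
        windows[part] = (cx - radius, cy - radius, cx + radius, cy + radius)
    return windows


def window_area(window):
    """Number of candidate positions inside an inclusive window."""
    x0, y0, x1, y1 = window
    return (x1 - x0 + 1) * (y1 - y0 + 1)


# Two made-up annotated faces, used only to exercise the sketch.
training = [
    {"left_eye": (10, 10), "right_eye": (20, 10), "nose": (15, 16), "mouth": (15, 22)},
    {"left_eye": (30, 40), "right_eye": (42, 40), "nose": (37, 46), "mouth": (37, 54)},
]
offsets = learn_offsets(training)
windows = attention_windows((50, 50), offsets, radius=2)
```

Under these assumptions, once one component has been found, a component classifier only needs to be evaluated inside each predicted window rather than at every position in the image, which is one way a speed improvement of the kind reported above could arise.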
