Face image analysis for expression measurement and detection of deceit

The Facial Action Coding System (FACS) (10) is an objective method for quantifying facial movement in terms of component actions. This system is widely used in behavioral investigations of emotion, cognitive processes, and social interaction. The coding is presently performed by highly trained human experts. This paper explores and compares techniques for automatically recognizing facial actions in sequences of images. These methods include unsupervised learning techniques for finding basis images, such as principal component analysis, independent component analysis, and local feature analysis, and supervised learning techniques such as Fisher's linear discriminants. These data-driven bases are compared to Gabor wavelets, in which the basis images are predefined. Best performances were obtained using the Gabor wavelet representation and the independent component representation, both of which achieved 96% accuracy for classifying twelve facial actions. Once the basis images are learned, the ICA representation takes 90% less CPU time than the Gabor representation to compute. The results provide evidence for the importance of using local image bases, high spatial frequencies, and statistical independence for classifying facial actions. Measurement of facial behavior at the level of detail of FACS provides information useful for the detection of deceit; applications to deceit detection are discussed.
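To make the contrast between learned and predefined basis images concrete, the following is a minimal sketch (not the authors' implementation) of the two representations the abstract highlights: a PCA basis learned from data via SVD, and a single predefined Gabor basis image. Random arrays stand in for aligned face images purely to keep the sketch runnable; all sizes and parameter values are illustrative assumptions.

```python
import numpy as np

# Toy stand-in for aligned face images: each row is one flattened image.
# Real FACS work would use registered face-image sequences.
rng = np.random.default_rng(0)
images = rng.normal(size=(50, 16 * 16))  # 50 images of 16x16 pixels

# --- Learned basis: PCA via SVD of the centered data matrix ---
mean = images.mean(axis=0)
centered = images - mean
# Rows of Vt are orthonormal basis images, ordered by explained variance.
_, _, Vt = np.linalg.svd(centered, full_matrices=False)
pca_basis = Vt[:10]             # keep the 10 leading basis images
codes = centered @ pca_basis.T  # each image's coordinates in that basis

# --- Predefined basis: one Gabor wavelet (Gaussian-windowed sinusoid) ---
def gabor_kernel(size=16, wavelength=4.0, theta=0.0, sigma=3.0):
    """Real part of a Gabor wavelet at orientation theta (radians)."""
    half = size // 2
    y, x = np.mgrid[-half:half, -half:half]
    xr = x * np.cos(theta) + y * np.sin(theta)
    envelope = np.exp(-(x**2 + y**2) / (2 * sigma**2))
    return envelope * np.cos(2 * np.pi * xr / wavelength)

gabor = gabor_kernel()
# Filter responses: projection of each image onto the Gabor basis image.
gabor_responses = centered @ gabor.ravel()
```

In the full schemes, a classifier (e.g. a discriminant on the codes) would be trained on such coefficients; an ICA basis would replace `pca_basis` with statistically independent, spatially local basis images, and a Gabor representation would use a whole bank of kernels over scales and orientations rather than the single kernel shown here.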

[1]  P. Ekman, et al. Facial action coding system: a technique for the measurement of facial movement, 1978.

[2]  Terrence J. Sejnowski, et al. An Information-Maximization Approach to Blind Separation and Blind Deconvolution, 1995, Neural Computation.

[3]  Roberto Brunelli, et al. Face Recognition: Features Versus Templates, 1993, IEEE Trans. Pattern Anal. Mach. Intell.

[4]  Garrison W. Cottrell, et al. EMPATH: Face, Emotion, and Gender Recognition Using Holons, 1990, NIPS.

[5]  Eero P. Simoncelli. Statistical models for images: compression, restoration and synthesis, 1997, Conference Record of the Thirty-First Asilomar Conference on Signals, Systems and Computers.

[6]  Joachim M. Buhmann, et al. Distortion Invariant Object Recognition in the Dynamic Link Architecture, 1993, IEEE Trans. Computers.

[7]  Penio S. Penev, et al. Local feature analysis: A general statistical theory for object representation, 1996.

[8]  Michael S. Gray. A comparison of local versus global image decompositions for visual speechreading, 1996.

[9]  David J. Kriegman, et al. Eigenfaces vs. Fisherfaces: Recognition Using Class Specific Linear Projection, 1996, ECCV.

[10]  M. Turk, et al. Eigenfaces for Recognition, 1991, Journal of Cognitive Neuroscience.

[11]  P. Ekman, et al. Smiles when lying, 1988, Journal of Personality and Social Psychology.

[12]  Terrence J. Sejnowski, et al. The "independent components" of natural scenes are edge filters, 1997, Vision Research.

[13]  M. Bartlett, et al. Face image analysis by unsupervised learning and redundancy reduction, 1998.

[14]  Garrison W. Cottrell, et al. Representing Face Images for Emotion Classification, 1996, NIPS.

[15]  Marian Stewart Bartlett, et al. Viewpoint Invariant Face Recognition using Independent Component Analysis and Attractor Networks, 1996, NIPS.