Deeply learned face representations are sparse, selective, and robust

This paper designs a high-performance deep convolutional network (DeepID2+) for face recognition. It is learned with the identification-verification supervisory signal. By increasing the dimension of hidden representations and adding supervision to early convolutional layers, DeepID2+ achieves new state-of-the-art results on the LFW and YouTube Faces benchmarks. Through empirical studies, we have discovered three properties of its deep neural activations that are critical for the high performance: sparsity, selectiveness, and robustness. (1) Neural activations are observed to be moderately sparse. Moderate sparsity maximizes the discriminative power of the deep net as well as the distance between images. Surprisingly, DeepID2+ still achieves high recognition accuracy even after the neural responses are binarized. (2) Neurons in higher layers are highly selective to identities and identity-related attributes. We can identify different subsets of neurons that are either constantly excited or constantly inhibited when different identities or attributes are present. Although DeepID2+ is not taught to distinguish attributes during training, it has implicitly learned such high-level concepts. (3) It is much more robust to occlusions, although occlusion patterns are not included in the training set.
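The sparsity and binarization observations can be made concrete with a small sketch. It is not the DeepID2+ pipeline: the feature vectors are random stand-ins for rectified top-layer activations, and the 512-dimensional size, the zero threshold, and the helper names (`binarize`, `sparsity`, `hamming_distance`, `fake_relu_features`) are all assumptions chosen for illustration.

```python
import random


def binarize(activations, threshold=0.0):
    """Map each activation to 1 if it exceeds the threshold, else 0."""
    return [1 if a > threshold else 0 for a in activations]


def sparsity(activations, threshold=0.0):
    """Fraction of neurons that are inactive (at or below the threshold)."""
    codes = binarize(activations, threshold)
    return 1.0 - sum(codes) / len(codes)


def hamming_distance(code_a, code_b):
    """Number of positions at which two binary codes differ."""
    return sum(x != y for x, y in zip(code_a, code_b))


def fake_relu_features(rng, dim):
    """Hypothetical stand-in for a network's rectified activations."""
    return [max(0.0, rng.gauss(0.0, 1.0)) for _ in range(dim)]


rng = random.Random(0)
feat_a = fake_relu_features(rng, 512)  # assumed feature dimension
feat_b = fake_relu_features(rng, 512)

code_a = binarize(feat_a)
code_b = binarize(feat_b)

print("sparsity of image A:", sparsity(feat_a))
print("Hamming distance between A and B:", hamming_distance(code_a, code_b))
```

With real features, the binary codes would be compared across images of the same and of different identities; the point of the sketch is only that the activation pattern (which neurons fire), rather than the activation magnitudes, is what a binarized representation retains.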
