Robust Face Alignment by Multi-Order High-Precision Hourglass Network

Heatmap regression (HR) has become one of the mainstream approaches for face alignment and has obtained promising results under constrained environments. However, when a face image suffers from large pose variations, heavy occlusions and complicated illuminations, the performances of HR methods degrade greatly due to the low resolutions of the generated landmark heatmaps and the exclusion of important high-order information that can be used to learn more discriminative features. To address the alignment problem for faces with extremely large poses and heavy occlusions, this paper proposes a heatmap subpixel regression (HSR) method and a multi-order cross geometry-aware (MCG) model, which are seamlessly integrated into a novel multi-order high-precision hourglass network (MHHN). The HSR method is proposed to achieve high-precision landmark detection by a well-designed subpixel detection loss (SDL) and subpixel detection technology (SDT). At the same time, the MCG model is able to use the proposed multi-order cross information to learn more discriminative representations for enhancing facial geometric constraints and context information. To the best of our knowledge, this is the first study to explore heatmap subpixel regression for robust and high-precision face alignment. The experimental results from challenging benchmark datasets demonstrate that our approach outperforms state-of-the-art methods in the literature.

[1]  Yi Yang,et al.  Supervision-by-Registration: An Unsupervised Approach to Improve the Precision of Facial Landmark Detectors , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[2]  Wenyan Wu,et al.  Leveraging Intra and Inter-Dataset Variations for Robust Face Alignment , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW).

[3]  Vladlen Koltun,et al.  Multi-Scale Context Aggregation by Dilated Convolutions , 2015, ICLR.

[4]  Kaiming He,et al.  Feature Pyramid Networks for Object Detection , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[5]  Deva Ramanan,et al.  Face detection, pose estimation, and landmark localization in the wild , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[6]  Qinghua Hu,et al.  Deep Global Generalized Gaussian Networks , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[7]  Shiguang Shan,et al.  Occlusion-Free Face Alignment: Deep Regression Networks Coupled with De-Corrupt AutoEncoders , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[8]  Timothy F. Cootes,et al.  Active Shape Models-Their Training and Application , 1995, Comput. Vis. Image Underst..

[9]  Josef Kittler,et al.  Wing Loss for Robust Facial Landmark Localisation with Convolutional Neural Networks , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[10]  Thomas S. Huang,et al.  Interactive Facial Feature Localization , 2012, ECCV.

[11]  Xin Jin,et al.  Face alignment in-the-wild: A Survey , 2016, Comput. Vis. Image Underst..

[12]  Larry S. Davis,et al.  Cross-X Learning for Fine-Grained Visual Categorization , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[13]  William J. Christmas,et al.  Dynamic Attention-Controlled Cascaded Shape Regression Exploiting Training Data Augmentation and Fuzzy-Set Sample Weighting , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[14]  Cheng Li,et al.  Face alignment by coarse-to-fine shape searching , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[15]  Qingshan Liu,et al.  Stacked Hourglass Network for Robust Facial Landmark Localisation , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW).

[16]  Fuchun Sun,et al.  HyperNet: Towards Accurate Region Proposal Generation and Joint Object Detection , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[17]  Heng Yang,et al.  Facial feature point detection: A comprehensive survey , 2014, Neurocomputing.

[18]  Pietro Perona,et al.  Robust Face Landmark Estimation under Occlusion , 2013, 2013 IEEE International Conference on Computer Vision.

[19]  Yici Cai,et al.  Look at Boundary: A Boundary-Aware Face Alignment Algorithm , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[20]  Rogério Schmidt Feris,et al.  A Unified Multi-scale Deep Convolutional Neural Network for Fast Object Detection , 2016, ECCV.

[21]  Mingjie Zheng,et al.  Robust Facial Landmark Detection via Occlusion-Adaptive Deep Networks , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[22]  William J. Christmas,et al.  Cascaded Collaborative Regression for Robust Facial Landmark Detection Trained Using a Mixture of Synthetic and Real Images With Dynamic Weighting , 2015, IEEE Transactions on Image Processing.

[23]  Charless C. Fowlkes,et al.  Occlusion Coherence: Localizing Occluded Faces with a Hierarchical Deformable Part Model , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[24]  Gerhard Rigoll,et al.  Robust Facial Landmark Detection via a Fully-Convolutional Local-Global Context Network , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[25]  Timothy F. Cootes,et al.  Feature Detection and Tracking with Constrained Local Models , 2006, BMVC.

[26]  Georgios Tzimiropoulos,et al.  Human Pose Estimation via Convolutional Part Heatmap Regression , 2016, ECCV.

[27]  Gerhard Rigoll,et al.  A Deep Convolutional Neural Network for Background Subtraction , 2017, ArXiv.

[28]  Hanjiang Lai,et al.  Robust Facial Landmark Detection via Recurrent Attentive-Refinement Networks , 2016, ECCV.

[29]  Shu-Tao Xia,et al.  Second-Order Attention Network for Single Image Super-Resolution , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[30]  Jun Chang,et al.  Face alignment by Component Adaptive Mechanism , 2019, Neurocomputing.

[31]  Xiangyu Zhu,et al.  Face Alignment in Full Pose Range: A 3D Total Solution , 2015, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[32]  Subhransu Maji,et al.  Bilinear Convolutional Neural Networks for Fine-Grained Visual Recognition , 2018, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[33]  Jian Sun,et al.  Face Alignment at 3000 FPS via Regressing Local Binary Features , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[34]  Xiaoou Tang,et al.  Facial Landmark Detection by Deep Multi-task Learning , 2014, ECCV.

[35]  Ming Tang,et al.  Semantic Alignment: Finding Semantically Consistent Ground-Truth for Facial Landmark Detection , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[36]  Trevor Darrell,et al.  Fully Convolutional Networks for Semantic Segmentation , 2017, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[37]  Jian Sun,et al.  Face Alignment by Explicit Shape Regression , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[38]  David J. Kriegman,et al.  Localizing Parts of Faces Using a Consensus of Exemplars , 2011, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[39]  Cheng Li,et al.  Unconstrained Face Alignment via Cascaded Compositional Learning , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[40]  Bo Du,et al.  Robust face alignment by cascaded regression and de-occlusion , 2019, Neural Networks.

[41]  Qilong Wang,et al.  Hyperlayer Bilinear Pooling with application to fine-grained categorization and image retrieval , 2017, Neurocomputing.

[42]  Stefanos Zafeiriou,et al.  300 Faces In-The-Wild Challenge: database and results , 2016, Image Vis. Comput..

[43]  Fernando De la Torre,et al.  Supervised Descent Method and Its Applications to Face Alignment , 2013, 2013 IEEE Conference on Computer Vision and Pattern Recognition.

[44]  Josephine Sullivan,et al.  One millisecond face alignment with an ensemble of regression trees , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[45]  Timothy F. Cootes,et al.  Active Appearance Models , 1998, ECCV.

[46]  Yi Yang,et al.  Style Aggregated Network for Facial Landmark Detection , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[47]  Qilong Wang,et al.  Global Second-Order Pooling Convolutional Networks , 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[48]  José Miguel Buenaposada,et al.  A Deeply-Initialized Coarse-to-fine Ensemble of Regression Trees for Face Alignment , 2018, ECCV.

[49]  Xiaoming Liu,et al.  Towards Highly Accurate and Stable Face Alignment for High-Resolution Videos , 2019, AAAI.

[50]  Maja Pantic,et al.  Gauss-Newton Deformable Part Models for Face Alignment In-the-Wild , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.