Learning a model of facial shape and expression from 4D scans

The field of 3D face modeling has a large gap between high-end and low-end methods. At the high end, the best facial animation is indistinguishable from real humans, but this comes at the cost of extensive manual labor. At the low end, face capture from consumer depth sensors relies on 3D face models that are not expressive enough to capture the variability in natural facial shape and expression. We seek a middle ground by learning a facial model from thousands of accurately aligned 3D scans. Our FLAME model (Faces Learned with an Articulated Model and Expressions) is designed to work with existing graphics software and be easy to fit to data. FLAME uses a linear shape space trained from 3800 scans of human heads. FLAME combines this linear shape space with an articulated jaw, neck, and eyeballs, pose-dependent corrective blendshapes, and additional global expression blendshapes. The pose and expression dependent articulations are learned from 4D face sequences in the D3DFACS dataset along with additional 4D sequences. We accurately register a template mesh to the scan sequences and make the D3DFACS registrations available for research purposes. In total the model is trained from over 33, 000 scans. FLAME is low-dimensional but more expressive than the FaceWarehouse model and the Basel Face Model. We compare FLAME to these models by fitting them to static 3D scans and 4D sequences using the same optimization method. FLAME is significantly more accurate and is available for research purposes (http://flame.is.tue.mpg.de).

[1]  Mark Pauly,et al.  Dynamic 3D avatar creation from hand-held video input , 2015, ACM Trans. Graph..

[2]  Paul Debevec,et al.  The Digital Emily project: photoreal facial modeling and animation , 2009, SIGGRAPH '09.

[3]  Jun Wang,et al.  A 3D facial expression database for facial behavior research , 2006, 7th International Conference on Automatic Face and Gesture Recognition (FGR06).

[4]  Timo Bolkart,et al.  A Groupwise Multilinear Correspondence Optimization for 3D Faces , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[5]  Yu Chen,et al.  A Practical System for Modelling Body Shapes from Single View Measurements , 2011, BMVC.

[6]  Saïda Bouakaz,et al.  Easy acquisition and real‐time animation of facial wrinkles , 2011, Comput. Animat. Virtual Worlds.

[7]  Peter Robinson,et al.  A 3D Morphable Eye Region Model for Gaze Estimation , 2016, ECCV.

[8]  Michael J. Black,et al.  FAUST: Dataset and Evaluation for 3D Mesh Registration , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[9]  Adrian Hilton,et al.  A FACS valid 3D dynamic action unit database with applications to 3D dynamic morphable facial modeling , 2011, 2011 International Conference on Computer Vision.

[10]  Ira Kemelmacher-Shlizerman,et al.  Face reconstruction in the wild , 2011, 2011 International Conference on Computer Vision.

[11]  Yiying Tong,et al.  FaceWarehouse: A 3D Facial Expression Database for Visual Computing , 2014, IEEE Transactions on Visualization and Computer Graphics.

[12]  Hans-Peter Seidel,et al.  A Statistical Model of Human Pose and Body Shape , 2009, Comput. Graph. Forum.

[13]  Xin Tong,et al.  Automatic acquisition of high-fidelity facial performances using monocular videos , 2014, ACM Trans. Graph..

[14]  Bernt Schiele,et al.  Building statistical shape spaces for 3D human modeling , 2015, Pattern Recognit..

[15]  D K Smith,et al.  Numerical Optimization , 2001, J. Oper. Res. Soc..

[16]  Aaron Hertzmann,et al.  Eurographics/ Acm Siggraph Symposium on Computer Animation (2006) Learning a Correlated Model of Identity and Pose-dependent Body Shape Variation for Real-time Synthesis , 2022 .

[17]  Thomas Vetter,et al.  Expression invariant 3D face recognition with a Morphable Model , 2008, 2008 8th IEEE International Conference on Automatic Face & Gesture Recognition.

[18]  Derek Bradley,et al.  Rigid stabilization of facial expressions , 2014, ACM Trans. Graph..

[19]  Stefanos Zafeiriou,et al.  Large Scale 3D Morphable Models , 2017, International Journal of Computer Vision.

[20]  Christian Theobalt,et al.  Reconstruction of Personalized 3D Face Rigs from Monocular Video , 2016, ACM Trans. Graph..

[21]  Fei Yang,et al.  Expression flow for 3D-aware face component transfer , 2011, ACM Trans. Graph..

[22]  Fernando De la Torre,et al.  Supervised Descent Method and Its Applications to Face Alignment , 2013, 2013 IEEE Conference on Computer Vision and Pattern Recognition.

[23]  Ira Kemelmacher-Shlizerman,et al.  Total Moving Face Reconstruction , 2014, ECCV.

[24]  Hanspeter Pfister,et al.  Face transfer with multilinear models , 2005, ACM Trans. Graph..

[25]  Matthew Turk,et al.  A Morphable Model For The Synthesis Of 3D Faces , 1999, SIGGRAPH.

[26]  Zoran Popovic,et al.  The space of human body shapes: reconstruction and parameterization from range scans , 2003, ACM Trans. Graph..

[27]  Thabo Beeler,et al.  Real-time high-fidelity facial performance capture , 2015, ACM Trans. Graph..

[28]  Gaël Varoquaux,et al.  Scikit-learn: Machine Learning in Python , 2011, J. Mach. Learn. Res..

[29]  Peter Robinson,et al.  A 3D Morphable Model of the Eye Region , 2016, Eurographics.

[30]  Hans-Peter Seidel,et al.  Interactive multi-resolution modeling on arbitrary meshes , 1998, SIGGRAPH.

[31]  Michael J. Black,et al.  SMPL: A Skinned Multi-Person Linear Model , 2015, ACM Trans. Graph..

[32]  P. Ekman,et al.  Facial action coding system , 2019 .

[33]  Sebastian Thrun,et al.  SCAPE: shape completion and animation of people , 2005, SIGGRAPH '05.

[34]  Kathleen M. Robinette,et al.  Civilian American and European Surface Anthropometry Resource (CAESAR), Final Report. Volume 1. Summary , 2002 .

[35]  Feng Xu,et al.  Controllable high-fidelity facial performance transfer , 2014, ACM Trans. Graph..

[36]  Michael J. Black,et al.  Detailed Full-Body Reconstructions of Moving People from Monocular RGB-D Sequences , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[37]  M. Pauly,et al.  Example-based facial rigging , 2010, ACM Trans. Graph..

[38]  P. Ekman,et al.  Facial action coding system: a technique for the measurement of facial movement , 1978 .

[39]  Jun Li,et al.  Lightweight wrinkle synthesis for 3D facial modeling and animation , 2015, Comput. Aided Des..

[40]  Sami Romdhani,et al.  Optimal Step Nonrigid ICP Algorithms for Surface Registration , 2007, 2007 IEEE Conference on Computer Vision and Pattern Recognition.

[41]  Jihun Yu,et al.  Realtime facial animation with on-the-fly correctives , 2013, ACM Trans. Graph..

[42]  Michael J. Black,et al.  OpenDR: An Approximate Differentiable Renderer , 2014, ECCV.

[43]  Christian Theobalt,et al.  Reconstructing detailed dynamic face geometry from monocular video , 2013, ACM Trans. Graph..

[44]  Sami Romdhani,et al.  A 3D Face Model for Pose and Illumination Invariant Face Recognition , 2009, 2009 Sixth IEEE International Conference on Advanced Video and Signal Based Surveillance.

[45]  Marcus A. Magnor,et al.  Sparse localized deformation components , 2013, ACM Trans. Graph..

[46]  Flavio Prieto,et al.  Fully automatic expression-invariant face correspondence , 2013, Machine Vision and Applications.

[47]  Justus Thies,et al.  Real-time expression transfer for facial reenactment , 2015, ACM Trans. Graph..

[48]  Stuart Geman,et al.  Statistical methods for tomographic image reconstruction , 1987 .

[49]  Christopher J. Taylor,et al.  Statistical models of shape - optimisation and evaluation , 2008 .

[50]  Jovan Popovic,et al.  Deformation transfer for triangle meshes , 2004, ACM Trans. Graph..

[51]  Mark Pauly,et al.  Realtime performance-based facial animation , 2011, ACM Trans. Graph..

[52]  Michael J. Black,et al.  Coregistration: Simultaneous Alignment and Modeling of Articulated 3D Shape , 2012, ECCV.

[53]  Alexander M. Bronstein,et al.  Numerical Geometry of Non-Rigid Shapes , 2009, Monographs in Computer Science.

[54]  Yangang Wang,et al.  Online modeling for realtime facial animation , 2013, ACM Trans. Graph..

[55]  Ira Kemelmacher-Shlizerman,et al.  What Makes Tom Hanks Look Like Tom Hanks , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[56]  A. Ponniah,et al.  Large Scale 3D Morphable Models , 2017, International Journal of Computer Vision.

[57]  Alberto Del Bimbo,et al.  Dictionary Learning Based 3D Morphable Model Construction for Face Recognition with Varying Expression and Pose , 2015, 2015 International Conference on 3D Vision.

[58]  Derek Bradley,et al.  An anatomically-constrained local deformation model for monocular face capture , 2016, ACM Trans. Graph..

[59]  Derek Bradley,et al.  High-quality passive facial performance capture using anchor frames , 2011, ACM Trans. Graph..

[60]  Derek Bradley,et al.  Enriching Facial Blendshape Rigs with Physical Simulation , 2017, Comput. Graph. Forum.

[61]  Alan Brunton,et al.  Multilinear Wavelets: A Statistical Shape Space for Human Faces , 2014, ECCV.

[62]  Stefanos Zafeiriou,et al.  A 3D Morphable Model Learnt from 10,000 Faces , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).