Detailed Human Shape and Pose from Images

Much of the research on video-based human motion capture assumes the body shape is known a priori and is represented coarsely (e.g., using cylinders or superquadrics to model limbs). These body models stand in sharp contrast to the richly detailed 3D body models used by the graphics community. Here we propose a method for recovering such models directly from images. Specifically, we represent the body using a recently proposed triangulated mesh model called SCAPE, which employs a low-dimensional but detailed parametric model of shape and pose-dependent deformations that is learned from a database of range scans of human bodies. Previous work showed that the parameters of the SCAPE model could be estimated from marker-based motion capture data. Here we go further and estimate the parameters directly from image data. We define a cost function between image observations and a hypothesized mesh and formulate the problem as optimization over the body shape and pose parameters using stochastic search. Our results show that such rich generative models enable the automatic recovery of detailed human shape and pose from images.
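The formulation above (a cost between image observations and a hypothesized model, minimized by stochastic search over shape and pose parameters) can be illustrated with a deliberately small sketch. Everything in it is an assumption made for illustration: the "body" is a filled ellipse rather than a SCAPE mesh, the observation is a single binary silhouette, the cost is a pixel-disagreement count, and the search is a simple annealed random perturbation. The names `render_silhouette`, `silhouette_cost` and `stochastic_search` are hypothetical and do not come from the paper.

```python
import random


def render_silhouette(params, size=32):
    """Toy stand-in for projecting a body model into the image.

    params = (cx, cy, rx, ry): the centre plays the role of pose and the
    radii play the role of shape. This is NOT the SCAPE model.
    """
    cx, cy, rx, ry = params
    rx, ry = max(rx, 0.5), max(ry, 0.5)  # keep the radii strictly positive
    return [[((x - cx) / rx) ** 2 + ((y - cy) / ry) ** 2 <= 1.0
             for x in range(size)] for y in range(size)]


def silhouette_cost(observed, hypothesized):
    """Count the pixels where the two binary masks disagree."""
    return sum(o != h
               for row_o, row_h in zip(observed, hypothesized)
               for o, h in zip(row_o, row_h))


def stochastic_search(observed, init, iters=2000, sigma=3.0, decay=0.998, seed=0):
    """Perturb the best hypothesis with Gaussian noise and keep improvements.

    The noise scale shrinks each iteration (a simple annealing schedule).
    """
    rng = random.Random(seed)
    best = tuple(init)
    best_cost = silhouette_cost(observed, render_silhouette(best))
    for _ in range(iters):
        candidate = tuple(p + rng.gauss(0.0, sigma) for p in best)
        cost = silhouette_cost(observed, render_silhouette(candidate))
        if cost < best_cost:
            best, best_cost = candidate, cost
        sigma *= decay
    return best, best_cost


# Synthetic "observation" generated from known parameters.
target = (18.0, 14.0, 6.0, 10.0)
observed = render_silhouette(target)

init = (16.0, 16.0, 8.0, 8.0)
init_cost = silhouette_cost(observed, render_silhouette(init))
estimate, final_cost = stochastic_search(observed, init)
print("initial cost:", init_cost, "final cost:", final_cost)
```

In this sketch the search never accepts a worse hypothesis, so the final cost cannot exceed the initial one. A real system would replace the ellipse renderer with a projection of the hypothesized mesh into each calibrated camera view and would typically combine several image cues in the cost.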
