Markerless tracking of complex human motions from multiple views

We present a method for markerless tracking of complex human motions from multiple camera views. Without markers, recovering a person's pose during such motions is challenging and requires strong image features and robust tracking. We propose a solution that integrates multiple image cues such as edges, color information and volumetric reconstruction. We show that combining these cues helps the tracker overcome ambiguous situations such as touching limbs or strong occlusions of body parts. Following a model-based approach, we match an articulated body model built from superellipsoids against these image cues. Stochastic Meta Descent (SMD) optimization is used to find the pose that best matches the images. Stochastic sampling makes SMD robust against local minima and lowers the computational cost, since a small set of predicted image features is sufficient for optimization. The power of SMD is demonstrated by comparing it to the commonly used Levenberg-Marquardt method. Results are presented for several challenging sequences with complex motions and full articulation, tracking 24 degrees of freedom at approximately 1 frame per second.
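As an illustration of the superellipsoid primitive mentioned above, the following minimal sketch evaluates the standard superquadric inside-outside function for a single body part in its local coordinate frame. The function name, the argument names (`radii`, `eps1`, `eps2`) and the choice of shape exponents are assumptions made for this example; the abstract does not specify how the body model is parametrized.

```python
def superellipsoid_inside_outside(point, radii, eps1, eps2):
    """Standard superquadric inside-outside function F.

    F < 1: point lies inside the superellipsoid
    F = 1: point lies on the surface
    F > 1: point lies outside

    point : (x, y, z) in the local frame of the body part (assumed)
    radii : (a1, a2, a3) semi-axis lengths, all > 0
    eps1, eps2 : shape exponents (> 0); eps1 = eps2 = 1 gives an ellipsoid
    """
    x, y, z = point
    a1, a2, a3 = radii
    # abs() keeps the fractional powers well defined for negative coordinates.
    xy_term = (abs(x / a1) ** (2.0 / eps2)
               + abs(y / a2) ** (2.0 / eps2)) ** (eps2 / eps1)
    z_term = abs(z / a3) ** (2.0 / eps1)
    return xy_term + z_term


# Hypothetical limb segment, elongated along z (values chosen for illustration).
limb_radii = (0.05, 0.05, 0.20)
f_center = superellipsoid_inside_outside((0.0, 0.0, 0.0), limb_radii, 1.0, 1.0)
f_tip = superellipsoid_inside_outside((0.0, 0.0, 0.20), limb_radii, 1.0, 1.0)
f_far = superellipsoid_inside_outside((0.10, 0.0, 0.0), limb_radii, 1.0, 1.0)
```

A test of this kind could, for example, check whether reconstructed volume elements fall inside a body part of the model; how the method actually compares the model against the image cues is not described in the abstract.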
