Guest Editorial: State of the Art in Image- and Video-Based Human Pose and Motion Estimation

Estimation of articulated human pose from images or video has been a long-standing goal of computer vision. Applications for such technology are prevalent across scientific and consumer domains. Despite more than a decade of active research and hundreds of papers, robust solutions to this problem remain out of reach in general unconstrained scenarios. We believe that progress requires an understanding of the current state of the art and of where current methods fail. This necessitates a comprehensive evaluation of current approaches on the same data and with the same metrics. This special pair of issues contains nine papers that span, and are representative of, recent developments in the field. Among these papers are those that explore algorithmic choices within more traditional methods (e.g., Bayesian filtering), as well as those that explore new technologies and representations. The topics span kinematic and physics-based motion priors, generative and discriminative inference methods, part-based models, and parametrized mesh-based body representations. Issues pertaining to the relationship between the number (or speed) of cameras and performance are also explored. To ensure that the methods are directly comparable, we required all authors to report performance on the same datasets (HUMANEVA) and with the same metrics. HUMANEVA is a publicly available dataset initially released for a series of workshops held in conjunction with NIPS (2006) and CVPR (2007). Neither of the two workshops had printed proceedings, but those participating were
