A Two-Step Methodology for Human Pose Estimation Increasing the Accuracy and Reducing the Amount of Learning Samples Dramatically

In this paper, we present a two-step methodology to improve existing methods for human pose estimation from a single depth image. Instead of learning a direct mapping from the depth image to the 3D pose, we first estimate the orientation of the standing person seen by the camera, and then use this information to dynamically select a pose estimation model suited to that particular orientation. We evaluated our method on a public dataset of realistic depth images with precise ground-truth joint locations. Our experiments show that our method decreases the error of a state-of-the-art pose estimation method by \(30\%\) or, alternatively, reduces the size of the required learning set by a factor larger than 10.
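The two-step idea can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`orientation_bin`, `two_step_pose`), the use of equal-width orientation bins, and the number of bins are all assumptions made for illustration, and the orientation estimator and per-orientation pose models are passed in as plain callables.

```python
from typing import Callable, List, Sequence, Tuple

# Hypothetical types for illustration only.
Joint = Tuple[float, float, float]
Pose = List[Joint]
PoseModel = Callable[[object], Pose]               # depth image -> 3D pose
OrientationEstimator = Callable[[object], float]   # depth image -> angle in degrees


def orientation_bin(angle_deg: float, n_bins: int) -> int:
    """Return the index of the equal-width bin whose centre is nearest to the angle.

    Bin 0 is centred on 0 degrees; angles wrap around at 360 degrees.
    """
    width = 360.0 / n_bins
    # Shift by half a bin so that bin centres, not bin edges, sit at multiples of width.
    return int(((angle_deg % 360.0) + width / 2.0) // width) % n_bins


def two_step_pose(
    depth_image: object,
    estimate_orientation: OrientationEstimator,
    models: Sequence[PoseModel],
) -> Pose:
    """Step 1: estimate the orientation. Step 2: run the model selected for it."""
    angle = estimate_orientation(depth_image)
    model = models[orientation_bin(angle, len(models))]
    return model(depth_image)
```

In this sketch each model only has to cover a restricted range of orientations, which is the mechanism the abstract credits for the lower error and the smaller learning set.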
