Learning silhouette features for control of human motion

People, even without training, can move in expressive ways to play charades, pantomime a winning touchdown, or act out a story for a child. The goal of our research is to provide untrained users with a vision-based interface for interactively controlling complex human motions. The insight behind our system is that the domain knowledge from a pre-recorded motion database allows plausible movement to be inferred from the incomplete information obtained from simple vision processing. The experimental scenario we use in this research is a single user animating two people swing dancing. We present a low-cost camera-based system (one PC and three video cameras) that allows an untrained user to “drive” the dance of the couple interactively (Figure 1). The user can change the high-level structure of the dance, introducing a different sequence of steps, rotations, or movements. The final sequence retains the quality of previously captured dance yet reflects the user’s higher-level goals and spontaneity. Our approach yields results for more complex behaviors and longer sequences than have been demonstrated in previous silhouette-based systems.

Published in ACM Transactions on Graphics (TOGS), 2005.