Paper information - Perpetual Motion: Generating Unbounded Human Motion

The modeling of human motion using machine learning methods has been widely studied. In essence, it is a time-series modeling problem involving predicting how a person will move in the future given how they moved in the past. Existing methods, however, typically have a short time horizon, predicting only a few frames to a few seconds of human motion. Here we focus on long-term prediction; that is, generating long sequences (potentially infinite) of plausible human motion. Furthermore, we do not rely on a long sequence of input motion for conditioning, but rather can predict how someone will move from as little as a single pose. Such a model has many uses in graphics (video games and crowd animation) and vision (as a prior for human motion estimation or for dataset creation). To address this problem, we propose a model to generate non-deterministic, ever-changing, perpetual human motion, in which the global trajectory and the body pose are cross-conditioned. We introduce a novel KL-divergence term with an implicit, unknown prior. We train this using a heavy-tailed function of the KL divergence of a white-noise Gaussian process, allowing latent sequence temporal dependency. We perform systematic experiments to verify its effectiveness and find that it is superior to baseline methods.

Michael J. Black | Yan Zhang | Siyu Tang
