Directed Acyclic Graph Neural Network for Human Motion Prediction

Human motion prediction is essential in human-robot interaction (HRI). Current research mostly considers joint dependencies but ignores bone dependencies and how the two relate within the human skeleton, which limits prediction accuracy. To address this issue, we represent the human skeleton as a directed acyclic graph with joints as vertices and bones as directed edges. We then propose a novel directed acyclic graph neural network (DA-GNN) that follows the encoder-decoder structure. The encoder stacks multiple encoder blocks, each of which includes a directed acyclic graph computational operator (DA-GCO), which updates joint and bone attributes based on the relationship between joint and bone dependencies in the observed human states, and a temporal update operator (TUO), which updates the temporal dynamics of joints and bones in the same observation. After applying these updates block by block, the encoder passes its final output to the decoder. The decoder includes a directed acyclic graph-based gated recurrent unit (DAG-GRU) and a multilayer perceptron (MLP) to predict future human states sequentially. To the best of our knowledge, this is the first work to introduce the relationship between bone and joint dependencies into human motion prediction. Our experimental evaluations on two datasets, CMU Mocap and Human3.6M, show that DA-GNN outperforms current models. Finally, we showcase the efficacy of DA-GNN in a realistic HRI scenario.
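
As an illustration of the skeleton representation described above, the following minimal Python sketch builds a directed acyclic graph with joints as vertices and bones as directed edges. It is not the DA-GNN implementation: the five-joint toy skeleton, the helper names `topological_order` and `bone_vectors`, and the choice of "child position minus parent position" as the bone attribute are all assumptions made for this example.

```python
from collections import deque

# Hypothetical 5-joint toy skeleton (not the CMU Mocap or Human3.6M layout).
JOINTS = ["hip", "spine", "head", "left_knee", "right_knee"]
# Each bone is a directed edge (parent_joint, child_joint) pointing away from the root.
BONES = [(0, 1), (1, 2), (0, 3), (0, 4)]


def topological_order(num_joints, bones):
    """Order joints with Kahn's algorithm; raise ValueError if there is a cycle."""
    indegree = [0] * num_joints
    children = [[] for _ in range(num_joints)]
    for parent, child in bones:
        children[parent].append(child)
        indegree[child] += 1
    queue = deque(j for j in range(num_joints) if indegree[j] == 0)
    order = []
    while queue:
        joint = queue.popleft()
        order.append(joint)
        for child in children[joint]:
            indegree[child] -= 1
            if indegree[child] == 0:
                queue.append(child)
    if len(order) != num_joints:
        raise ValueError("graph contains a cycle, so it is not a DAG")
    return order


def bone_vectors(positions, bones):
    """Assumed bone attribute: child joint position minus parent joint position."""
    return [
        tuple(c - p for p, c in zip(positions[parent], positions[child]))
        for parent, child in bones
    ]


# One observed frame of 3D joint positions for the toy skeleton (made-up values).
POSITIONS = [
    (0.0, 0.0, 0.0),    # hip
    (0.0, 1.0, 0.0),    # spine
    (0.0, 2.0, 0.0),    # head
    (-1.0, -1.0, 0.0),  # left_knee
    (1.0, -1.0, 0.0),   # right_knee
]

ORDER = topological_order(len(JOINTS), BONES)
BONE_ATTRS = bone_vectors(POSITIONS, BONES)
```

Here `ORDER` visits every parent joint before its children, which is the property a graph operator can rely on when passing information along directed bones, and `BONE_ATTRS` holds one vector per bone in the same order as `BONES`.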