Dynamic Factor Graphs for Time Series Modeling

This article presents a method for training Dynamic Factor Graphs (DFGs) with continuous latent state variables. A DFG includes factors modeling joint probabilities between hidden and observed variables, and factors modeling dynamical constraints on hidden variables. The DFG assigns a scalar energy to each configuration of hidden and observed variables, and a gradient-based inference procedure finds the minimum-energy state sequence for a given observation sequence. Because the factors are designed to ensure a constant partition function, they can be trained by minimizing the expected energy over training sequences with respect to the factors' parameters. These alternated inference and parameter updates can be seen as a deterministic EM-like procedure. Using smoothing regularizers, DFGs are shown to reconstruct chaotic attractors and to perfectly separate a mixture of independent oscillatory sources. DFGs outperform the best-known algorithm on the CATS competition benchmark for time series prediction, and they successfully reconstruct missing motion-capture data.
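
A minimal sketch may help make the alternating procedure concrete. The code below is an illustrative assumption, not the paper's implementation: it uses linear factors with quadratic (Gaussian-like) energies, whereas the paper's factors can be nonlinear, and all names, dimensions, and learning rates are hypothetical. It shows the two nested gradient descents the abstract describes: an inner loop that infers the minimum-energy latent sequence for fixed parameters (the E-like step), and an outer update that lowers the energy of that sequence with respect to the parameters (the M-like step).

    import torch

    # Toy data: a length-T observation sequence (dimensions are hypothetical).
    T, obs_dim, lat_dim = 100, 1, 3
    y = torch.randn(T, obs_dim)

    # Two factors, kept linear here for simplicity:
    # g maps the latent state z_t to the observation y_t (observation factor),
    # f maps z_{t-1} to z_t (dynamical factor).
    g = torch.nn.Linear(lat_dim, obs_dim)
    f = torch.nn.Linear(lat_dim, lat_dim)
    z = torch.zeros(T, lat_dim, requires_grad=True)  # latent states, found by inference

    opt_z = torch.optim.SGD([z], lr=0.1)   # inference: gradient descent on the states
    opt_w = torch.optim.SGD(list(g.parameters()) + list(f.parameters()), lr=0.01)

    def energy():
        # Scalar energy of a configuration: sum of squared factor residuals.
        e_obs = ((g(z) - y) ** 2).sum()           # observation factor energy
        e_dyn = ((f(z[:-1]) - z[1:]) ** 2).sum()  # dynamical factor energy
        return e_obs + e_dyn

    for epoch in range(50):
        # E-like step: find the minimum-energy state sequence for fixed parameters.
        for _ in range(20):
            opt_z.zero_grad()
            energy().backward()
            opt_z.step()
        # M-like step: lower the energy of the inferred sequence w.r.t. the parameters.
        opt_w.zero_grad()
        energy().backward()
        opt_w.step()

Because the factors keep the partition function constant, descending the raw energy in this way stands in for likelihood maximization; the paper's smoothing regularizers on the latent trajectory would enter simply as additional terms in the energy.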
