Direct Multi-Step Time Series Prediction Using TD-lambda

This paper explores the application of Temporal Difference (TD) learning (Sutton, 1988) to forecasting the behavior of dynamical systems with real-valued outputs (as opposed to game-like situations). The performance of TD learning relative to standard supervised learning depends on the amount of noise present in the data. In this paper, we use a deterministic chaotic time series from a low-noise laser. For the task of direct five-step-ahead prediction, our experiments show that standard supervised learning outperforms TD learning. The TD algorithm can be viewed as linking adjacent predictions. A similar effect can be obtained by sharing the internal representation in the network. We therefore compare two architectures under both learning paradigms: the first ("separate hidden units") consists of an individual network for each of the five direct multi-step prediction tasks; the second ("shared hidden units") has a single, larger hidden layer that learns a representation from which the predictions for all five steps are generated. For this data set, we do not find any significant difference between the two architectures.
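
To make the phrase "linking adjacent predictions" concrete, the following is a minimal sketch of how training targets for the two paradigms might be constructed. It is an illustration under stated assumptions, not the exact procedure used in the experiments: we assume the prediction made at time t for the value k steps ahead is trained toward a lambda-weighted mixture of the later predictions for that same future value (made at t+1, t+2, ...) and the observed value itself. The function names, the array layout `preds[t, k-1]`, and the use of NumPy are hypothetical choices made for this example.

```python
import numpy as np


def supervised_targets(x, horizon=5):
    """Supervised targets: targets[t, k-1] = x[t + k] for k = 1..horizon."""
    T = len(x) - horizon
    return np.stack([x[k:k + T] for k in range(1, horizon + 1)], axis=1)


def td_lambda_targets(x, preds, lam, horizon=5):
    """TD-lambda style targets (assumed formulation, see lead-in).

    preds[t, k-1] is the network's current prediction, made at time t,
    of x[t + k]. The target for that prediction mixes the later
    predictions of the same future value with the observed value:

        (1 - lam) * sum_{n=1}^{k-1} lam^(n-1) * preds[t+n, k-n-1]
        + lam^(k-1) * x[t + k]

    The weights sum to one. lam = 1 recovers the supervised target;
    lam = 0 uses only the next time step's prediction (or the observed
    value when k = 1).
    """
    T = len(x) - horizon
    if preds.shape[0] < len(x) - 1 or preds.shape[1] < horizon:
        raise ValueError("preds must have shape (>= len(x) - 1, >= horizon)")
    targets = np.empty((T, horizon))
    for t in range(T):
        for k in range(1, horizon + 1):
            # Contribution of the actually observed value.
            g = lam ** (k - 1) * x[t + k]
            # Contributions of later predictions of the same value x[t + k].
            for n in range(1, k):
                g += (1 - lam) * lam ** (n - 1) * preds[t + n, k - n - 1]
            targets[t, k - 1] = g
    return targets


# Toy usage with a synthetic series and random stand-in "predictions".
rng = np.random.default_rng(0)
x = np.sin(0.3 * np.arange(50))
preds = rng.normal(size=(len(x) - 1, 5))

sup = supervised_targets(x)
td_one = td_lambda_targets(x, preds, lam=1.0)
td_zero = td_lambda_targets(x, preds, lam=0.0)
```

In this sketch, setting `lam=1.0` reproduces the supervised targets exactly, while `lam=0.0` ties each k-step prediction to the (k-1)-step prediction made one time step later. Either set of targets could be used with the "separate hidden units" or the "shared hidden units" architecture, since the architectures differ only in how the five outputs are computed, not in what they are trained toward.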