Sequence learning through PIPE and automatic task decomposition

Analog gradient-based recurrent neural nets can learn complex prediction tasks. Most, however, tend to fail in the case of long minimal time lags between relevant training events. Discrete methods such as search in a space of event-memorizing programs, on the other hand, are not necessarily affected by long time lags at all: we show that discrete "Probabilistic Incremental Program Evolution" (PIPE) can solve several long time lag tasks that have been successfully solved by only one analog method ("Long Short-Term Memory" — LSTM). In fact, sometimes PIPE even outperforms LSTM. Existing discrete methods, however, cannot easily deal with problems whose solutions exhibit comparatively high algorithmic complexity. We overcome this drawback by introducing filtering, a novel, general, data-driven divide-and-conquer technique for automatic task decomposition that is not limited to any particular learning method. We compare PIPE plus filtering to various analog recurrent net methods.
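
To make the discrete program-search idea concrete, the following is a minimal, illustrative sketch of a PIPE-style loop: candidate programs are sampled from a probability model, evaluated on the task, and the model's probabilities are shifted toward the best program found so far. The flat instruction list, the toy instruction set, and the placeholder fitness function are simplifying assumptions made for this sketch only; PIPE itself samples program trees from a probabilistic prototype tree and uses the task's actual training data for evaluation.

```python
# Illustrative PIPE-style search loop (simplified sketch, not the paper's implementation):
# sample candidate programs from a probability model, evaluate them,
# and shift the model's probabilities toward the best program found.
import random

SYMBOLS = ["READ", "STORE", "RECALL", "OUTPUT0", "OUTPUT1"]  # toy instruction set (assumed)
PROGRAM_LEN = 6
LEARNING_RATE = 0.2

def sample_program(probs):
    """Draw one program: each position picks a symbol according to its distribution."""
    return [random.choices(SYMBOLS, weights=p)[0] for p in probs]

def fitness(program):
    """Placeholder score; a real task would run the program on training sequences."""
    return sum(1 for s in program if s in ("STORE", "RECALL"))

def adapt(probs, best):
    """Increase the probability of the best program's symbols, then renormalize."""
    for pos, sym in enumerate(best):
        probs[pos][SYMBOLS.index(sym)] += LEARNING_RATE
        total = sum(probs[pos])
        probs[pos] = [w / total for w in probs[pos]]

# Start from a uniform distribution over symbols at every program position.
probs = [[1.0 / len(SYMBOLS)] * len(SYMBOLS) for _ in range(PROGRAM_LEN)]
best, best_fit = None, float("-inf")
for generation in range(50):
    candidates = [sample_program(probs) for _ in range(20)]
    gen_best = max(candidates, key=fitness)
    if fitness(gen_best) > best_fit:
        best, best_fit = gen_best, fitness(gen_best)
    adapt(probs, best)

print(best, best_fit)
```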