Adaptive history compression for learning to divide and conquer

This paper investigates how a system can learn to reduce the descriptions of event sequences without losing information. It is shown that the learning system ought to concentrate on unexpected inputs and ignore expected ones. This insight leads to the construction of neural systems which learn to 'divide and conquer' by recursively composing sequences. The first system creates a self-organizing multilevel hierarchy of recurrent predictors. The second system involves only two recurrent networks: it tries to collapse a multilevel predictor hierarchy into a single recurrent net. Experiments show that the system can require less computation per time step and far fewer training sequences than conventional training algorithms for recurrent nets.
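
The central idea, that only unexpected inputs need to be retained, can be illustrated with a minimal sketch. It is not the architecture described in the paper: the recurrent predictor is replaced here by a simple first-order lookup table, and all names (fit_predictor, compress, decompress) are hypothetical, chosen for illustration only. The sketch shows that a sequence can be rebuilt exactly from the predictor plus the time-stamped inputs it failed to predict, so the reduced description loses no information.

```python
from collections import Counter, defaultdict


def fit_predictor(sequence):
    """Learn a first-order table mapping each symbol to its most frequent successor.

    Assumption: this table stands in for a trained recurrent predictor.
    """
    counts = defaultdict(Counter)
    for prev, nxt in zip(sequence, sequence[1:]):
        counts[prev][nxt] += 1
    return {prev: c.most_common(1)[0][0] for prev, c in counts.items()}


def compress(sequence, predictor):
    """Keep only the (time step, symbol) pairs that the predictor got wrong."""
    unexpected = []
    prev = None  # no context before the first symbol, so it is always unexpected
    for t, sym in enumerate(sequence):
        if predictor.get(prev) != sym:
            unexpected.append((t, sym))
        prev = sym
    return unexpected, len(sequence)


def decompress(unexpected, length, predictor):
    """Rebuild the full sequence from the unexpected inputs and the predictor."""
    surprises = dict(unexpected)
    out = []
    prev = None
    for t in range(length):
        # Use the stored symbol where the predictor failed, its prediction elsewhere.
        sym = surprises[t] if t in surprises else predictor[prev]
        out.append(sym)
        prev = sym
    return out


sequence = list("abcabcabcxabcabc")
predictor = fit_predictor(sequence)
unexpected, length = compress(sequence, predictor)
restored = decompress(unexpected, length, predictor)
```

In a hierarchy of the kind the abstract describes, the list of unexpected inputs would itself be the input sequence of a higher-level predictor, which therefore operates on a shorter sequence and a slower time scale. The time stamps are kept because the positions of the unexpected inputs carry information that is needed for exact reconstruction.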