Cascaded redundancy reduction.

We describe a method for incrementally constructing a hierarchical generative model of an ensemble of binary data vectors. The model is composed of stochastic, binary, logistic units. Hidden units are added to the model one at a time with the goal of minimizing the information required to describe the data vectors using the model. In addition to the top-down generative weights that define the model, there are bottom-up recognition weights that determine the binary states of the hidden units given a data vector. Even though the stochastic generative model can produce each data vector in many ways, the recognition model is forced to pick just one of these ways. The recognition model therefore underestimates the ability of the generative model to predict the data, but this underestimation greatly simplifies the process of searching for the generative and recognition weights of a new hidden unit.