Adaptive Soft Weight Tying using Gaussian Mixtures

Geoffrey E. Hinton Department of Computer Science . U ni versi ty of Toran to Toronto, Canada M5S lA4 One way of simplifying neural networks so they generalize better is to add an extra t.erm 10 the error fUll c tion that will penalize complexit.y. \Ve propose a new penalt.y t.erm in which the dist rihution of weight values is modelled as a mixture of multiple gaussians . C nder this model, a set of weights is simple if the weights can be clustered into subsets so that the weights in each cluster have similar values . We allow the parameters of the mixture model to adapt at t.he same time as t.he network learns. Simulations demonstrate that this complexity term is more effective than previous complexity terms.