Generalization and network design strategies

An interesting property of connectionist systems is their ability to learn from examples. Although most recent work in the field concentrates on reducing learning times, the most important feature of a learning machine is its generalization performance. It is usually accepted that good generalization performance on real-world problems cannot be achieved unless some a priori knowledge about the task is built into the system. Back-propagation networks provide a way of specifying such knowledge by imposing constraints both on the architecture of the network and on its weights. In general, such constraints can be considered as particular transformations of the parameter space. Building a constrained network for image recognition appears to be a feasible task. We describe a small handwritten digit recognition problem and show that, even though the problem is linearly separable, single-layer networks exhibit poor generalization performance. Multilayer constrained networks perform very well on this task when organized in a hierarchical structure with shift-invariant feature detectors. These results confirm the idea that minimizing the number of free parameters in the network enhances generalization.
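
The abstract gives no implementation details, but the core mechanism it names, shift-invariant feature detectors obtained by constraining many units to share one set of weights, can be illustrated with a short sketch. The NumPy code below is an assumption-laden toy, not the paper's network: the kernel size, nonlinearity, and image sizes are all hypothetical. It shows the two properties the abstract appeals to: the layer has only as many free parameters as one small kernel, and its response pattern shifts along with the input (strictly, the map is shift-equivariant; invariance would come from later subsampling stages).

    import numpy as np

    def shared_weight_feature_map(image, kernel):
        """Slide one small kernel (the shared weights) over the image.

        Every output unit applies the *same* weights, so the layer has
        far fewer free parameters than a fully connected layer with the
        same number of units.
        """
        kh, kw = kernel.shape
        ih, iw = image.shape
        out = np.zeros((ih - kh + 1, iw - kw + 1))
        for y in range(out.shape[0]):
            for x in range(out.shape[1]):
                patch = image[y:y + kh, x:x + kw]
                out[y, x] = np.tanh(np.sum(patch * kernel))  # squashing nonlinearity
        return out

    # Toy demonstration that the detector's response moves with the input.
    rng = np.random.default_rng(0)
    kernel = rng.standard_normal((3, 3)) * 0.1   # stand-in for learned weights

    image = np.zeros((16, 16))
    image[4:7, 4:7] = 1.0                 # a small blob
    shifted = np.roll(image, 5, axis=1)   # the same blob, shifted right by 5

    a = shared_weight_feature_map(image, kernel)
    b = shared_weight_feature_map(shifted, kernel)
    # Away from the borders, the feature map of the shifted image equals
    # the shifted feature map of the original image.
    print(np.allclose(np.roll(a, 5, axis=1)[:, 5:], b[:, 5:]))   # True

Counting parameters makes the abstract's closing claim concrete: a fully connected map from a 16x16 input to a 14x14 output would need 16*16*14*14 = 50,176 weights, while the shared-kernel version above has 9, which is the sense in which constraining the architecture minimizes the number of free parameters.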