Vapnik-Chervonenkis Dimension Bounds for Two- and Three-Layer Networks

We show that the Vapnik-Chervonenkis dimension of the class of functions that can be computed by arbitrary two-layer or some completely connected three-layer threshold networks with real inputs is at least linear in the number of weights in the network. In Valiant's "probably approximately correct" learning framework, this implies that the number of random training examples necessary for learning in these networks is at least linear in the number of weights.
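Stated more formally (a sketch in standard notation; the symbol $W$, the class $\mathcal{F}_W$, and the constants below are introduced here for illustration and are not fixed by the abstract), the first claim is
\[
  \mathrm{VCdim}(\mathcal{F}_W) \;=\; \Omega(W),
\]
where $\mathcal{F}_W$ denotes the class of functions computable by such a threshold network with $W$ weights. Combined with the standard PAC sample-complexity lower bound $m = \Omega\!\left(\mathrm{VCdim}(\mathcal{F}_W)/\epsilon\right)$ for learning to accuracy $\epsilon$ with constant confidence, this yields the second claim: any learning algorithm for these networks requires
\[
  m \;=\; \Omega\!\left(\frac{W}{\epsilon}\right)
\]
random training examples, i.e. at least linearly many in the number of weights.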