Lower bounds on the VC-dimension of smoothly parametrized function classes

We examine the relationship between the VC-dimension and the number of parameters of a smoothly parametrized function class. We show that the VC-dimension of such a class is at least k if there exists a k-dimensional differentiable manifold in the parameter space on which distinct points correspond to distinct decision boundaries. Using this result, we obtain lower bounds on the VC-dimension proportional to the number of parameters for several function classes, including two-layer neural networks with certain smooth activation functions and radial basis functions with a Gaussian basis. These lower bounds hold even if the magnitudes of the parameters are restricted to be arbitrarily small. In Valiant's probably approximately correct (PAC) learning framework, this implies that the number of examples necessary for learning these function classes is at least linear in the number of parameters.
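As a rough illustration of the kind of class covered (a sketch only; the precise architecture, conditions on the activation function, and constants are those stated in the paper), consider a two-layer network with k hidden units, a smooth activation \sigma (e.g. \tanh), and inputs x \in \mathbb{R}^d:

\[
\mathcal{F} \;=\; \Bigl\{\, x \;\mapsto\; \sum_{i=1}^{k} v_i \, \sigma\!\bigl(w_i^{\top} x + b_i\bigr) \;:\; v_i, b_i \in \mathbb{R},\ w_i \in \mathbb{R}^d \,\Bigr\},
\]

a class with \(W = k(d+2)\) adjustable parameters. A lower bound of the form \(\mathrm{VCdim}\bigl(\operatorname{sgn}(\mathcal{F})\bigr) = \Omega(W)\), combined with the standard \(\Omega(\mathrm{VCdim}/\varepsilon)\) sample-complexity lower bound for PAC learning, gives the claimed requirement of a number of examples at least linear in the number of parameters.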
