Finiteness results for sigmoidal “neural” networks

Proc. 25th Annual Symp. Theory Computing, San Diego, May 1993

This paper deals with analog circuits. It establishes the finiteness of VC dimension, teaching dimension, and several other measures of sample complexity which arise in learning theory. It also shows that the equivalence of behaviors, and the loading problem, are effectively decidable, modulo a widely believed conjecture in number theory. The results, the first ones that are independent of weight size, apply when the gate function is the “standard sigmoid” commonly used in neural network research. The proofs rely on very recent developments in the elementary theory of real numbers with exponentiation. (Some weaker conclusions are also given for more general analytic gate functions.) Applications to learnability of sparse polynomials are also mentioned.
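
For reference, the abstract does not spell out the “standard sigmoid”; the usual reading, assumed here, is the logistic function. The second line is only an illustrative single-hidden-layer network, with hypothetical symbols k, c_i, w_i, and b_i that do not come from the abstract. It shows how such a network is built from the exponential function and field operations alone, which is presumably why the theory of the reals with exponentiation is relevant.

```latex
% Assumed definition of the "standard sigmoid" (logistic function).
\sigma(x) = \frac{1}{1 + e^{-x}}, \qquad x \in \mathbb{R}

% Illustrative network with k sigmoidal gates; all symbols are hypothetical.
f(x) = c_0 + \sum_{i=1}^{k} c_i \, \sigma(w_i \cdot x + b_i), \qquad x \in \mathbb{R}^n
```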
