Feedforward Neural Network Methodology
[1] Shun-ichi Amari,et al. Statistical Theory of Learning Curves under Entropic Loss Criterion , 1993, Neural Computation.
[2] Roberto Battiti,et al. First- and Second-Order Methods for Learning: Between Steepest Descent and Newton's Method , 1992, Neural Computation.
[3] Santosh S. Venkatesh,et al. The capacity of the Hopfield associative memory , 1987, IEEE Trans. Inf. Theory.
[4] Masahiko Arai,et al. Bounds on the number of hidden units in binary-valued three-layer neural networks , 1993, Neural Networks.
[5] George Finlay Simmons,et al. Introduction to Topology and Modern Analysis , 1963 .
[6] Stephen I. Gallant,et al. Neural network learning and expert systems , 1993 .
[7] R. Fletcher. Practical Methods of Optimization , 1988 .
[8] J. Baumeister. Stable solution of inverse problems , 1987 .
[9] Eduardo D. Sontag,et al. Shattering All Sets of k Points in General Position Requires (k − 1)/2 Parameters , 1997, Neural Computation.
[10] D. Pollard. Convergence of stochastic processes , 1984 .
[11] Lynn Waterhouse,et al. Neurophilosophy: Toward a Unified Science of the Mind/Brain , 1988 .
[12] S. Graubard,et al. The artificial intelligence debate: false starts, real foundations , 1990 .
[13] Saburo Muroga,et al. Threshold logic and its applications , 1971 .
[14] Neil E. Cotter,et al. The CMAC and a theorem of Kolmogorov , 1992, Neural Networks.
[15] Harris Drucker,et al. Boosting and Other Ensemble Methods , 1994, Neural Computation.
[16] David J. Hand,et al. Discrimination and Classification , 1982 .
[17] Allan Pinkus,et al. Multilayer Feedforward Networks with a Non-Polynomial Activation Function Can Approximate Any Function , 1991, Neural Networks.
[18] B. Efron. The jackknife, the bootstrap, and other resampling plans , 1987 .
[19] Shun-ichi Amari,et al. Learning Curves, Model Selection and Complexity of Neural Networks , 1992, NIPS.
[20] Christian Lebiere,et al. The Cascade-Correlation Learning Architecture , 1989, NIPS.
[21] Amir Dembo,et al. On the capacity of associative memories with linear threshold functions , 1989, IEEE Trans. Inf. Theory.
[22] Nils J. Nilsson,et al. The Mathematical Foundations of Learning Machines , 1990 .
[23] Alon Orlitsky,et al. Lower bounds on threshold and related circuits via communication complexity , 1994, IEEE Trans. Inf. Theory.
[24] Eduardo D. Sontag,et al. Feedback Stabilization Using Two-Hidden-Layer Nets , 1991, 1991 American Control Conference.
[25] Umesh V. Vazirani,et al. An Introduction to Computational Learning Theory , 1994 .
[26] G.S. May,et al. Manufacturing ICs the neural way , 1994, IEEE Spectrum.
[27] C. Micchelli,et al. Approximation by superposition of sigmoidal and radial basis functions , 1992 .
[28] Andrew R. Barron,et al. Minimum complexity density estimation , 1991, IEEE Trans. Inf. Theory.
[29] Neil E. Cotter,et al. The Stone-Weierstrass theorem and its application to neural networks , 1990, IEEE Trans. Neural Networks.
[30] J. Uspensky. Introduction to mathematical probability , 1938 .
[31] Charles A. Micchelli,et al. How to Choose an Activation Function , 1993, NIPS.
[32] A. Barron. Approximation and Estimation Bounds for Artificial Neural Networks , 1991, COLT '91.
[33] M. Stone. Cross‐Validatory Choice and Assessment of Statistical Predictions , 1976 .
[34] Vladimir N. Vapnik,et al. The Nature of Statistical Learning Theory , 2000, Statistics for Engineering and Information Science.
[35] S. M. Carroll,et al. Construction of neural nets using the radon transform , 1989, International 1989 Joint Conference on Neural Networks.
[36] Yong Liu,et al. Neural Network Model Selection Using Asymptotic Jackknife Estimator and Cross-Validation Method , 1992, NIPS.
[37] R. Shibata. Selection of the order of an autoregressive model by Akaike's information criterion , 1976 .
[38] Peter L. Bartlett,et al. Efficient agnostic learning of neural networks with bounded fan-in , 1996, IEEE Trans. Inf. Theory.
[39] Bernard Widrow,et al. Sensitivity of feedforward neural networks to weight errors , 1990, IEEE Trans. Neural Networks.
[40] C. Fefferman. Reconstructing a neural net from its output , 1994 .
[41] H. Jeffreys. An invariant form for the prior probability in estimation problems , 1946, Proceedings of the Royal Society of London. Series A. Mathematical and Physical Sciences.
[42] David J. C. MacKay,et al. Bayesian Model Comparison and Backprop Nets , 1991, NIPS.
[43] Michel Cosnard,et al. Bounds on the Number of Units for Computing Arbitrary Dichotomies by Multilayer Perceptrons , 1994, J. Complex..
[44] Eric B. Baum,et al. On the capabilities of multilayer perceptrons , 1988, J. Complex..
[45] Alexander J. Smola,et al. Support Vector Method for Function Approximation, Regression Estimation and Signal Processing , 1996, NIPS.
[46] Gregory J. Wolff,et al. Optimal Brain Surgeon and general network pruning , 1993, IEEE International Conference on Neural Networks.
[47] D. Naiman,et al. Inclusion-Exclusion-Bonferroni Identities and Inequalities for Discrete Tube-Like Problems via Euler Characteristics , 1992 .
[48] M. Talagrand. Sharper Bounds for Gaussian and Empirical Processes , 1994 .
[49] C. Darken,et al. Rates of Convex Approximation in Non-Hilbert Spaces , 1997, Constructive Approximation.
[50] L. K. Jones,et al. Good weights and hyperbolic kernels for neural networks, projection pursuit, and pattern classification: Fourier strategies for extracting information from high-dimensional data , 1994, IEEE Trans. Inf. Theory.
[51] Terrence L. Fine,et al. Parameter Convergence and Learning Curves for Neural Networks , 1999, Neural Computation.
[52] J. Orbach. Principles of Neurodynamics. Perceptrons and the Theory of Brain Mechanisms. , 1962 .
[53] D. G. Stork,et al. Is backpropagation biologically plausible? , 1989, International 1989 Joint Conference on Neural Networks.
[54] R. Fisher. The Statistical Utilization of Multiple Measurements , 1938 .
[55] R. Ash,et al. Real analysis and probability , 1975 .
[56] Luc Devroye,et al. Automatic Pattern Recognition: A Study of the Probability of Error , 1988, IEEE Trans. Pattern Anal. Mach. Intell..
[57] Kenji Fukumizu,et al. A Regularity Condition of the Information Matrix of a Multilayer Perceptron Network , 1996, Neural Networks.
[58] Marcus R. Frean,et al. A "Thermal" Perceptron Learning Rule , 1992, Neural Computation.
[59] Simon Haykin,et al. Neural Networks: A Comprehensive Foundation , 1998 .
[60] Lawrence D. Jackel,et al. Neural Network Applications in Character Recognition and Document Analysis , 1994 .
[61] Eduardo D. Sontag,et al. Neural Networks with Quadratic VC Dimension , 1995, J. Comput. Syst. Sci..
[62] Yoshua Bengio,et al. Convolutional networks for images, speech, and time series , 1998 .
[63] Geoffrey E. Hinton,et al. Learning internal representations by error propagation , 1986 .
[64] Edoardo Amaldi,et al. From finding maximum feasible subsystems of linear systems to feedforward neural network design , 1994 .
[65] Hans Ulrich Simon,et al. Robust Trainability of Single Neurons , 1995, J. Comput. Syst. Sci..
[66] Geoffrey E. Hinton,et al. Adaptive Mixtures of Local Experts , 1991, Neural Computation.
[67] Adam Kowalczyk,et al. Estimates of Storage Capacity of Multilayer Perceptron with Threshold Logic Hidden Units , 1997, Neural Networks.
[68] Isabelle Guyon,et al. What Size Test Set Gives Good Error Rate Estimates? , 1998, IEEE Trans. Pattern Anal. Mach. Intell..
[69] Andrew R. Barron,et al. Complexity Regularization with Application to Artificial Neural Networks , 1991 .
[70] Hidefumi Katsuura,et al. Computational aspects of Kolmogorov's superposition theorem , 1994, Neural Networks.
[71] W. Pitts,et al. A Logical Calculus of the Ideas Immanent in Nervous Activity (1943) , 2021, Ideas That Created the Future.
[72] Pascal Koiran,et al. On the complexity of approximating mappings using feedforward networks , 1993, Neural Networks.
[73] Babak Hassibi,et al. Second Order Derivatives for Network Pruning: Optimal Brain Surgeon , 1992, NIPS.
[74] Gerhard Paass,et al. Assessing and Improving Neural Network Predictions by the Bootstrap Algorithm , 1992, NIPS.
[75] H. Akaike. A new look at the statistical model identification , 1974 .
[76] D. Aldous. Probability Approximations via the Poisson Clumping Heuristic , 1988 .
[77] Mihalis Yannakakis,et al. Simple Local Search Problems That are Hard to Solve , 1991, SIAM J. Comput..
[78] John E. Moody,et al. The Effective Number of Parameters: An Analysis of Generalization and Regularization in Nonlinear Learning Systems , 1991, NIPS.
[79] Thomas Kailath,et al. Classification of linearly nonseparable patterns by linear threshold elements , 1995, IEEE Trans. Neural Networks.
[80] R. Shibata. Asymptotically Efficient Selection of the Order of the Model for Estimating Parameters of a Linear Process , 1980 .
[81] Michael Kearns,et al. A Bound on the Error of Cross Validation Using the Approximation and Estimation Rates, with Consequences for the Training-Test Split , 1995, Neural Computation.
[82] Tadashi Shibata,et al. Neuron-MOS Temporal Winner Search Hardware for Fully-Parallel Data Processing , 1995, NIPS.
[83] David Haussler,et al. What Size Net Gives Valid Generalization? , 1989, Neural Computation.
[84] John J. Hopfield,et al. Neural networks and physical systems with emergent collective computational abilities , 1999 .
[85] Feller William,et al. An Introduction To Probability Theory And Its Applications , 1950 .
[86] L. Jones. Constructive approximations for neural networks by sigmoidal functions , 1990, Proc. IEEE.
[87] Michel Loève,et al. Probability Theory I , 1977 .
[88] R. DeVore,et al. Optimal nonlinear approximation , 1989 .
[89] F. Girosi,et al. Networks for approximation and learning , 1990, Proc. IEEE.
[90] Hava T. Siegelmann,et al. On the complexity of training neural networks with continuous activation functions , 1995, IEEE Trans. Neural Networks.
[91] Lawrence D. Jackel,et al. Handwritten Digit Recognition with a Back-Propagation Network , 1989, NIPS.
[92] Martin E. Dyer,et al. A Random Polynomial Time Algorithm for Approximating the Volume of Convex Bodies , 1989, STOC.
[93] Eduardo D. Sontag,et al. UNIQUENESS OF WEIGHTS FOR NEURAL NETWORKS , 1993 .
[94] Anders Krogh,et al. Introduction to the theory of neural computation , 1994, The advanced book program.
[95] Vijay Balasubramanian,et al. Statistical Inference, Occam's Razor, and Statistical Mechanics on the Space of Probability Distributions , 1996, Neural Computation.
[96] John E. Dennis,et al. Numerical methods for unconstrained optimization and nonlinear equations , 1983, Prentice Hall series in computational mathematics.
[97] Yann LeCun,et al. Optimal Brain Damage , 1989, NIPS.
[98] Ali A. Minai,et al. Perturbation response in feedforward networks , 1994, Neural Networks.
[99] Yoshua Bengio,et al. Gradient-based learning applied to document recognition , 1998, Proc. IEEE.
[100] G. Lorentz. Approximation of Functions , 1966 .
[101] Thomas M. Cover,et al. Geometrical and Statistical Properties of Systems of Linear Inequalities with Applications in Pattern Recognition , 1965, IEEE Trans. Electron. Comput..
[102] Peter Müller,et al. Issues in Bayesian Analysis of Neural Network Models , 1998, Neural Computation.
[103] Heekuck Oh,et al. Neural Networks for Pattern Recognition , 1993, Adv. Comput..
[104] Terrence L. Fine,et al. Forecasting Demand for Electric Power , 1992, NIPS.
[105] Paul C. Kainen,et al. Functionally Equivalent Feedforward Neural Networks , 1994, Neural Computation.
[106] F. Vallet,et al. Robustness in Multilayer Perceptrons , 1993, Neural Computation.
[107] L. Schumaker. Spline Functions: Basic Theory , 1981 .
[108] Gregory J. Wolff,et al. Optimal Brain Surgeon: Extensions and performance comparisons , 1993, NIPS 1993.
[109] H. D. Block. The perceptron: a model for brain functioning. I , 1962 .
[110] D. Haussler,et al. Rigorous learning curve bounds from statistical mechanics , 1996 .
[111] D. Pollard. Empirical Processes: Theory and Applications , 1990 .
[112] J. Stephen Judd,et al. Neural network design and the complexity of learning , 1990, Neural network modeling and connectionism.
[113] Shun-ichi Amari,et al. A universal theorem on learning curves , 1993, Neural Networks.
[114] C. O'Cinneide. The Mean is within One Standard Deviation of Any Median , 1990 .
[115] Ken-ichi Funahashi,et al. On the approximate realization of continuous mappings by neural networks , 1989, Neural Networks.
[116] D. Mackay,et al. Bayesian methods for adaptive models , 1992 .
[117] George Cybenko,et al. Approximation by superpositions of a sigmoidal function , 1989, Math. Control. Signals Syst..
[118] Marek Karpinski,et al. Polynomial bounds for VC dimension of sigmoidal neural networks , 1995, STOC '95.
[119] Jorma Rissanen,et al. Stochastic Complexity in Statistical Inquiry , 1989, World Scientific Series in Computer Science.
[120] Andrew R. Barron,et al. Universal approximation bounds for superpositions of a sigmoidal function , 1993, IEEE Trans. Inf. Theory.
[121] Terrence L. Fine,et al. Asymptotics of Gradient-based Neural Network Training Algorithms , 1994, NIPS.
[122] Thomas Kailath,et al. On the Perceptron Learning Algorithm on Data with High Precision , 1994, J. Comput. Syst. Sci..
[123] Terrence L. Fine,et al. Assessing generalization of feedforward neural networks , 1995, Proceedings of 1995 IEEE International Symposium on Information Theory.
[124] Wray L. Buntine,et al. Computing second derivatives in feed-forward networks: a review , 1994, IEEE Trans. Neural Networks.
[125] L. K. Jones,et al. The computational intractability of training sigmoidal neural networks , 1997, IEEE Trans. Inf. Theory.
[126] Kurt Hornik,et al. Approximation capabilities of multilayer feedforward networks , 1991, Neural Networks.
[127] Marcus Frean,et al. The Upstart Algorithm: A Method for Constructing and Training Feedforward Neural Networks , 1990, Neural Computation.
[128] Martin Anthony,et al. Computational Learning Theory , 1992 .
[129] S. E. Decatur,et al. Application of neural networks to terrain classification , 1989, International 1989 Joint Conference on Neural Networks.
[130] David E. Rumelhart,et al. Generalization by Weight-Elimination with Application to Forecasting , 1990, NIPS.
[131] Héctor J. Sussmann,et al. Uniqueness of the weights for minimal feedforward nets with a given input-output map , 1992, Neural Networks.
[132] Robert H. Dodier,et al. Geometry of Early Stopping in Linear Networks , 1995, NIPS.
[133] L. Jones. A Simple Lemma on Greedy Approximation in Hilbert Space and Convergence Rates for Projection Pursuit Regression and Neural Network Training , 1992 .
[134] Mohammad Bagher Menhaj,et al. Training feedforward networks with the Marquardt algorithm , 1994, IEEE Trans. Neural Networks.
[135] M. Stone. Asymptotics for and against cross-validation , 1977 .
[136] David A. Sprecher,et al. A universal mapping for Kolmogorov's superposition theorem , 1993, Neural Networks.
[137] Jack D. Cowan,et al. Neural Networks: The Early Days , 1989, NIPS.
[138] M.H. Hassoun,et al. Fundamentals of Artificial Neural Networks , 1996, Proceedings of the IEEE.
[139] C. Micchelli,et al. Some remarks on ridge functions , 1987 .
[140] Takeo Kanade,et al. Human Face Detection in Visual Scenes , 1995, NIPS.
[141] Hrushikesh Narhar Mhaskar,et al. Approximation properties of a multilayered feedforward artificial neural network , 1993, Adv. Comput. Math..
[142] Peter Auer,et al. Exponentially many local minima for single neurons , 1995, NIPS.
[143] G. Schwarz. Estimating the Dimension of a Model , 1978 .
[144] Yoshua Bengio,et al. Pattern Recognition and Neural Networks , 1995 .
[145] Massimiliano Pontil,et al. Properties of Support Vector Machines , 1998, Neural Computation.
[146] John J. Shynk,et al. Stationary points of a single-layer perceptron for nonseparable data models , 1993, Neural Networks.
[147] Marvin Minsky,et al. Perceptrons: expanded edition , 1988 .
[148] Leslie G. Valiant,et al. A theory of the learnable , 1984, CACM.
[149] Martin Fodslette Møller. A Scaled Conjugate Gradient Algorithm for Fast Supervised Learning , 1993 .
[150] H. D. Block,et al. Analysis of a Four-Layer Series-Coupled Perceptron. II , 1962 .
[151] Richard M. Dudley,et al. Some special Vapnik-Chervonenkis classes , 1981, Discret. Math..
[152] Klaus-Robert Müller,et al. Asymptotic statistical theory of overtraining and cross-validation , 1997, IEEE Trans. Neural Networks.
[153] Ker-Chau Li,et al. Asymptotic Optimality for $C_p, C_L$, Cross-Validation and Generalized Cross-Validation: Discrete Index Set , 1987 .
[154] Eytan Domany,et al. Learning by Choice of Internal Representations , 1988, Complex Syst..
[155] Ron Shonkwiler,et al. Separating the vertices of N-cubes by hyperplanes and its application to artificial neural networks , 1993, IEEE Trans. Neural Networks.
[156] D. M. Y. Sommerville,et al. An Introduction to the Geometry of N Dimensions , 2022 .
[157] Alberto L. Sangiovanni-Vincentelli,et al. Efficient Parallel Learning Algorithms for Neural Networks , 1988, NIPS.
[158] John F. Kolen,et al. Backpropagation is Sensitive to Initial Conditions , 1990, Complex Syst..
[159] Jorma Rissanen,et al. Universal coding, information, prediction, and estimation , 1984, IEEE Trans. Inf. Theory.
[160] Robert E. Schapire,et al. Efficient distribution-free learning of probabilistic concepts , 1990, Proceedings [1990] 31st Annual Symposium on Foundations of Computer Science.
[161] Paul M. B. Vitányi,et al. An Introduction to Kolmogorov Complexity and Its Applications , 1997, Graduate Texts in Computer Science.
[162] P. Halmos. Finite-Dimensional Vector Spaces , 1960 .
[163] Gavin J. Gibson,et al. Exact Classification with Two-Layer Neural Nets , 1996, J. Comput. Syst. Sci..
[164] Terrence L. Fine,et al. Sample Size Requirements for Feedforward Neural Networks , 1994, NIPS.
[165] Barak A. Pearlmutter,et al. Automatic Learning Rate Maximization by On-Line Estimation of the Hessian's Eigenvectors , 1992, NIPS 1992.
[166] Vera Kurková,et al. Kolmogorov's theorem and multilayer neural networks , 1992, Neural Networks.
[167] Wolfgang Maass,et al. Neural Nets with Superlinear VC-Dimension , 1994, Neural Computation.
[168] Geoffrey E. Hinton,et al. Bayesian Learning for Neural Networks , 1995 .
[169] Richard P. Brent,et al. Fast training algorithms for multilayer neural nets , 1991, IEEE Trans. Neural Networks.
[170] John E. Moody,et al. Towards Faster Stochastic Gradient Search , 1991, NIPS.
[171] Gérard Dreyfus,et al. Handwritten digit recognition by neural networks with single-layer training , 1992, IEEE Trans. Neural Networks.
[172] Kurt Hornik,et al. Degree of Approximation Results for Feedforward Networks Approximating Unknown Mappings and Their Derivatives , 1994, Neural Computation.
[173] Isabelle Guyon,et al. Design of a neural network character recognizer for a touch terminal , 1991, Pattern Recognit..
[174] Norbert Sauer,et al. On the Density of Families of Sets , 1972, J. Comb. Theory, Ser. A.
[175] J. Rissanen. Stochastic Complexity and Modeling , 1986 .
[176] Barak A. Pearlmutter. Fast Exact Multiplication by the Hessian , 1994, Neural Computation.
[177] Charles Fefferman,et al. Recovering a Feed-Forward Net From Its Output , 1993, NIPS.
[178] Rich Caruana,et al. Learning Many Related Tasks at the Same Time with Backpropagation , 1994, NIPS.
[179] Eduardo D. Sontag,et al. Feedforward Nets for Interpolation and Classification , 1992, J. Comput. Syst. Sci..
[180] Andrew W. Moore,et al. Hoeffding Races: Accelerating Model Selection Search for Classification and Function Approximation , 1993, NIPS.
[181] Noga Alon,et al. Scale-sensitive dimensions, uniform convergence, and learnability , 1993, Proceedings of 1993 IEEE 34th Annual Foundations of Computer Science.
[182] John C. Platt,et al. A Neural Network Classifier for the I100 OCR Chip , 1995, NIPS.
[183] P. Hall. The Bootstrap and Edgeworth Expansion , 1992 .
[184] G. McLachlan. Discriminant Analysis and Statistical Pattern Recognition , 1992 .
[185] James O. Berger,et al. Statistical Decision Theory and Bayesian Analysis, Second Edition , 1985 .
[186] T. Kailath,et al. Discrete Neural Computation: A Theoretical Foundation , 1995 .
[187] Sherif Hashem,et al. Optimal Linear Combinations of Neural Networks , 1997, Neural Networks.
[188] Yann LeCun,et al. Transforming Neural-Net Output Levels to Probability Distributions , 1990, NIPS.
[189] J J Hopfield,et al. Neurons with graded response have collective computational properties like those of two-state neurons. , 1984, Proceedings of the National Academy of Sciences of the United States of America.
[190] Ronald L. Rivest,et al. Training a 3-node neural network is NP-complete , 1988, COLT '88.
[191] Ali A. Minai,et al. On the derivatives of the sigmoid , 1993, Neural Networks.
[192] Peter L. Bartlett,et al. The Sample Complexity of Pattern Classification with Neural Networks: The Size of the Weights is More Important than the Size of the Network , 1998, IEEE Trans. Inf. Theory.
[193] E. Littmann. Generalization Abilities of Cascade Network Architectures , 1992 .
[194] Eric B. Baum,et al. The Perceptron Algorithm is Fast for Nonmalicious Distributions , 1990, Neural Computation.
[195] Isabelle Guyon,et al. Structural Risk Minimization for Character Recognition , 1991, NIPS.
[196] N. Wiener. The Fourier Integral: and certain of its Applications , 1933, Nature.
[197] H. White. Some Asymptotic Results for Learning in Single Hidden-Layer Feedforward Network Models , 1989 .
[198] Stephen I. Gallant,et al. Perceptron-based learning algorithms , 1990, IEEE Trans. Neural Networks.