Discovering Neural Nets with Low Kolmogorov Complexity and High Generalization Capability
[1] Jürgen Schmidhuber,et al. Low-Complexity Art , 2017 .
[2] Jürgen Schmidhuber,et al. Flat Minima , 1997, Neural Computation.
[3] Jieyu Zhao,et al. Simple Principles of Metalearning , 1996 .
[4] Pattie Maes,et al. Incremental Self-Improvement for Life-Time Multi-Agent Reinforcement Learning , 1996 .
[5] Jürgen Schmidhuber,et al. Solving POMDPs with Levin Search and EIRA , 1996, ICML.
[6] Wolfgang J. Paul,et al. Autonomous theory building systems , 1995, Ann. Oper. Res..
[7] K. Siu,et al. Theoretical Advances in Neural Computation and Learning , 1994, Springer US.
[8] Jürgen Schmidhuber,et al. Simplifying Neural Nets by Discovering Flat Minima , 1994, NIPS.
[9] Wolfgang Maass,et al. Perspectives of Current Research about the Complexity of Learning on Neural Nets , 1994 .
[10] Geoffrey E. Hinton,et al. Keeping Neural Networks Simple , 1993 .
[11] Gustavo Deco,et al. Elimination of Overtraining by a Mutual Information Network , 1993 .
[12] J. Schmidhuber. Reducing the Ratio Between Learning Complexity and Number of Time Varying Variables in Fully Recurrent Nets , 1993 .
[13] Jürgen Schmidhuber,et al. A ‘Self-Referential’ Weight Matrix , 1993 .
[14] Shun-ichi Amari,et al. Statistical Theory of Learning Curves under Entropic Loss Criterion , 1993, Neural Computation.
[15] Ming Li,et al. An Introduction to Kolmogorov Complexity and Its Applications , 2019, Texts in Computer Science.
[16] Osamu Watanabe,et al. Kolmogorov Complexity and Computational Complexity , 2012, EATCS Monographs on Theoretical Computer Science.
[17] Babak Hassibi,et al. Second Order Derivatives for Network Pruning: Optimal Brain Surgeon , 1992, NIPS.
[18] Jürgen Schmidhuber,et al. Learning Factorial Codes by Predictability Minimization , 1992, Neural Computation.
[19] Zhaoping Li,et al. Understanding Retinal Color Coding from First Principles , 1992, Neural Computation.
[20] Geoffrey E. Hinton,et al. Simplifying Neural Networks by Soft Weight-Sharing , 1992, Neural Computation.
[21] David J. C. MacKay,et al. A Practical Bayesian Framework for Backpropagation Networks , 1992, Neural Computation.
[22] E. Allender. Applications of Time-Bounded Kolmogorov Complexity in Complexity Theory , 1992 .
[23] Jürgen Schmidhuber,et al. Learning Complex, Extended Sequences Using the Principle of History Compression , 1992, Neural Computation.
[24] Vladimir Vapnik,et al. Principles of Risk Minimization for Learning Theory , 1991, NIPS.
[25] Isabelle Guyon,et al. Structural Risk Minimization for Character Recognition , 1991, NIPS.
[26] John E. Moody,et al. The Effective Number of Parameters: An Analysis of Generalization and Regularization in Nonlinear Learning Systems , 1991, NIPS.
[27] Anders Krogh,et al. A Simple Weight Decay Can Improve Generalization , 1991, NIPS.
[28] Andrew R. Barron,et al. Complexity Regularization with Application to Artificial Neural Networks , 1991 .
[29] Suzanna Becker,et al. Unsupervised Learning Procedures for Neural Networks , 1991, Int. J. Neural Syst..
[30] L. N. Kanal,et al. Uncertainty in Artificial Intelligence 5 , 1990 .
[31] Barak A. Pearlmutter,et al. Chaitin-Kolmogorov Complexity and Generalization in Neural Networks , 1990, NIPS.
[32] Yann LeCun,et al. Second Order Properties of Error Surfaces: Learning Time and Generalization , 1990, NIPS.
[33] David E. Rumelhart,et al. Predicting the Future: a Connectionist Approach , 1990, Int. J. Neural Syst..
[34] Stephen I. Gallant,et al. A connectionist learning algorithm with provable generalization and scaling bounds , 1990, Neural Networks.
[35] Thomas G. Dietterich. Limitations on Inductive Learning , 1989, ML.
[36] Ming Li,et al. A theory of learning simple concepts under simple distributions and average case complexity for the universal distribution , 1989, 30th Annual Symposium on Foundations of Computer Science.
[37] Ming Li,et al. The Minimum Description Length Principle and Its Application to Online Learning of Handprinted Characters , 1989, IJCAI.
[38] Edwin P. D. Pednault,et al. Some Experiments in Applying Inductive Inference Principles to Surface Reconstruction , 1989, IJCAI.
[39] P. Gács,et al. Kolmogorov's Contributions to Information Theory and Algorithmic Complexity , 1989 .
[40] Ronald L. Rivest,et al. Inferring Decision Trees Using the Minimum Description Length Principle , 1989, Inf. Comput..
[41] C. Watkins. Learning from delayed rewards , 1989 .
[42] David Haussler,et al. What Size Net Gives Valid Generalization? , 1989, Neural Computation.
[43] Jürgen Schmidhuber,et al. A Local Learning Algorithm for Dynamic Feedforward and Recurrent Networks , 1989 .
[44] Charles H. Bennett. Logical depth and physical complexity , 1988 .
[45] David Haussler,et al. Quantifying Inductive Bias: AI Learning Algorithms and Valiant's Learning Framework , 1988, Artif. Intell..
[46] Ralph Linsker,et al. Self-organization in a perceptual network , 1988, Computer.
[47] Michael C. Mozer,et al. Skeletonization: A Technique for Trimming the Fat from a Network via Relevance Assessment , 1988, NIPS.
[48] Gregory J. Chaitin,et al. Algorithmic information theory , 1987, Cambridge tracts in theoretical computer science.
[49] David Haussler,et al. Occam's Razor , 1987, Inf. Process. Lett..
[50] J. Rissanen. Stochastic Complexity and Modeling , 1986 .
[51] Geoffrey E. Hinton,et al. Learning internal representations by error propagation , 1986 .
[52] Ray J. Solomonoff,et al. The Application of Algorithmic Probability to Problems in Artificial Intelligence , 1985, UAI.
[53] Leslie G. Valiant,et al. A theory of the learnable , 1984, STOC '84.
[54] Leonid A. Levin,et al. Randomness Conservation Inequalities; Information and Independence in Mathematical Theories , 1984, Inf. Control..
[55] Paul E. Utgoff,et al. Shift of bias for inductive concept learning , 1984 .
[56] Juris Hartmanis,et al. Generalized Kolmogorov complexity and the structure of feasible computations , 1983, 24th Annual Symposium on Foundations of Computer Science (sfcs 1983).
[57] Richard S. Sutton,et al. Neuronlike adaptive elements that can solve difficult learning control problems , 1983, IEEE Transactions on Systems, Man, and Cybernetics.
[58] J. Rissanen. A Universal Prior for Integers and Estimation by Minimum Description Length , 1983 .
[59] J. Rissanen,et al. Modeling By Shortest Data Description , 1978, Autom..
[60] G. Chaitin. Algorithmic Information Theory , 1977, IBM J. Res. Dev..
[61] G. Chaitin. A Theory of Program Size Formally Identical to Information Theory , 1975, JACM.
[62] P. Werbos. Beyond Regression: New Tools for Prediction and Analysis in the Behavioral Sciences , 1974 .
[63] L. Levin,et al. The Complexity of Finite Objects and the Development of the Concepts of Information and Randomness by Means of the Theory of Algorithms , 1970 .
[64] Gregory J. Chaitin,et al. On the Length of Programs for Computing Finite Binary Sequences: statistical considerations , 1969, JACM.
[65] C. S. Wallace,et al. An Information Measure for Classification , 1968, Comput. J..
[66] A. Kolmogorov. Three approaches to the quantitative definition of information , 1968 .
[67] Per Martin-Löf,et al. The Definition of Random Sequences , 1966, Inf. Control..
[68] Gregory J. Chaitin,et al. On the Length of Programs for Computing Finite Binary Sequences , 1966, JACM.
[69] Ray J. Solomonoff,et al. A Formal Theory of Inductive Inference. Part II , 1964, Inf. Control..
[70] A. Kolmogoroff. Grundbegriffe der Wahrscheinlichkeitsrechnung , 1933 .