Shifting Inductive Bias with Success-Story Algorithm, Adaptive Levin Search, and Incremental Self-Improvement
[1] Andrew McCallum,et al. Instance-Based Utile Distinctions for Reinforcement Learning with Hidden State , 1995, ICML.
[2] Jürgen Schmidhuber. Discovering Neural Nets with Low Kolmogorov Complexity and High Generalization Capability , 1997 .
[3] Ray J. Solomonoff,et al. A Formal Theory of Inductive Inference. Part I , 1964, Inf. Control..
[4] Mark B. Ring. Continual learning in reinforcement environments , 1995, GMD-Bericht.
[5] Leonid A. Levin,et al. Randomness Conservation Inequalities; Information and Independence in Mathematical Theories , 1984, Inf. Control..
[6] Dave Cliff,et al. Adding Temporary Memory to ZCS , 1994, Adapt. Behav..
[7] Leslie Pack Kaelbling,et al. Planning and Acting in Partially Observable Stochastic Domains , 1998, Artif. Intell..
[8] Long-Ji Lin,et al. Reinforcement learning for robots using neural networks , 1992 .
[9] Lonnie Chrisman,et al. Reinforcement Learning with Perceptual Aliasing: The Perceptual Distinctions Approach , 1992, AAAI.
[10] K. Narendra,et al. Learning Automata - A Survey , 1974 .
[11] Nichael Lynn Cramer,et al. A Representation for the Adaptive Generation of Simple Sequential Programs , 1985, ICGA.
[12] Jürgen Schmidhuber. Discovering Solutions with Low Kolmogorov Complexity and High Generalization Capability , 1995, ICML.
[13] Gregory J. Chaitin,et al. On the Length of Programs for Computing Finite Binary Sequences: statistical considerations , 1969, JACM.
[14] William I. Gasarch,et al. Book Review: An introduction to Kolmogorov Complexity and its Applications Second Edition, 1997 by Ming Li and Paul Vitanyi (Springer (Graduate Text Series)) , 1997, SIGACT News.
[15] Lawrence J. Fogel,et al. Artificial Intelligence through Simulated Evolution , 1966 .
[16] Peter Dayan,et al. Q-learning , 1992, Machine Learning.
[17] Ray J. Solomonoff,et al. A Formal Theory of Inductive Inference. Part II , 1964, Inf. Control..
[18] Mark S. Boddy,et al. Deliberation Scheduling for Problem Solving in Time-Constrained Environments , 1994, Artif. Intell..
[19] John R. Koza,et al. Genetic evolution and co-evolution of computer programs , 1991 .
[20] Douglas B. Lenat,et al. Theory Formation by Heuristic Search , 1983, Artificial Intelligence.
[21] Michael I. Jordan,et al. Reinforcement Learning Algorithm for Partially Observable Markov Decision Problems , 1994, NIPS.
[22] Michael L. Littman,et al. Memoryless policies: theoretical limitations and practical results , 1994 .
[23] Juergen Schmidhuber,et al. A General Method For Incremental Self-Improvement And Multi-Agent Learning In Unrestricted Environme , 1999 .
[24] Ray J. Solomonoff,et al. The Application of Algorithmic Probability to Problems in Artificial Intelligence , 1985, UAI.
[25] Dana H. Ballard,et al. Active Perception and Reinforcement Learning , 1990, Neural Computation.
[26] Pattie Maes,et al. Incremental Self-Improvement for Life-Time Multi-Agent Reinforcement Learning , 1996 .
[27] Xu Xin. Reinforcement learning algorithm for partially observable Markov decision processes , 2004 .
[28] Stuart J. Russell,et al. Principles of Metareasoning , 1989, Artif. Intell..
[29] Russell Greiner,et al. PALO: A Probabilistic Hill-Climbing Algorithm , 1996, Artif. Intell..
[30] Donald A. Berry,et al. Bandit Problems: Sequential Allocation of Experiments. , 1986 .
[31] R. J. Williams,et al. Simple Statistical Gradient-Following Algorithms for Connectionist Reinforcement Learning , 2004, Machine Learning.
[32] Kumpati S. Narendra,et al. Learning Automata - A Survey , 1974, IEEE Trans. Syst. Man Cybern..
[33] Jieyu Zhao,et al. Simple Principles of Metalearning , 1996 .
[34] Leslie Pack Kaelbling,et al. Learning in embedded systems , 1993 .
[35] Jürgen Schmidhuber,et al. Reinforcement Learning in Markovian and Non-Markovian Environments , 1990, NIPS.
[36] Jürgen Schmidhuber,et al. Solving POMDPs with Levin Search and EIRA , 1996, ICML.
[37] David H. Wolpert,et al. The Lack of A Priori Distinctions Between Learning Algorithms , 1996, Neural Computation.
[38] J. Bather,et al. Multi‐Armed Bandit Allocation Indices , 1990 .
[39] Lorien Y. Pratt,et al. A Survey of Transfer Between Connectionist Networks , 1996, Connect. Sci..
[40] P. W. Jones,et al. Bandit Problems, Sequential Allocation of Experiments , 1987 .
[41] Pravin Varaiya,et al. Stochastic Systems: Estimation, Identification, and Adaptive Control , 1986 .
[42] Juergen Schmidhuber,et al. Incremental self-improvement for life-time multi-agent reinforcement learning , 1996 .
[43] A. Kolmogorov. Three approaches to the quantitative definition of information , 1968 .
[44] Richard S. Sutton,et al. Learning to predict by the methods of temporal differences , 1988, Machine Learning.
[45] Osamu Watanabe,et al. Kolmogorov Complexity and Computational Complexity , 2012, EATCS Monographs on Theoretical Computer Science.
[46] Ming Li,et al. An Introduction to Kolmogorov Complexity and Its Applications , 2019, Texts in Computer Science.
[47] Juergen Schmidhuber,et al. On learning how to learn learning strategies , 1994 .
[48] Jürgen Schmidhuber,et al. A ‘Self-Referential’ Weight Matrix , 1993 .
[49] Paul E. Utgoff,et al. Shift of bias for inductive concept learning , 1984 .