Evolutionary computation versus reinforcement learning