An information-theoretic approach to curiosity-driven reinforcement learning
[1] E. Jaynes. Information Theory and Statistical Mechanics, 1957.
[2] Robert Shaw, et al. The Dripping Faucet As A Model Chaotic System, 1984.
[3] C. Watkins. Learning from delayed rewards, 1989.
[4] K. Rose, et al. Statistical mechanics and phase transitions in clustering, 1990, Physical Review Letters.
[5] Sebastian Thrun, et al. Active Exploration in Dynamic Environments, 1991, NIPS.
[6] Jürgen Schmidhuber, et al. Curious model-building control systems, 1991, IEEE International Joint Conference on Neural Networks.
[7] Naftali Tishby, et al. Distributional Clustering of English Words, 1993, ACL.
[8] Ben J. A. Kröse, et al. Learning from delayed rewards, 1995, Robotics Auton. Syst.
[9] K. Rose. Deterministic annealing for clustering, compression, classification, regression, and related optimization problems, 1998, Proc. IEEE.
[10] Naftali Tishby, et al. The information bottleneck method, 2000, arXiv.
[11] James P. Crutchfield, et al. Synchronizing to the Environment: Information-Theoretic Constraints on Agent Learning, 2001, Adv. Complex Syst.
[12] William Bialek, et al. Predictability, Complexity, and Learning, 2002.
[13] Ronen I. Brafman, et al. R-MAX - A General Polynomial Time Algorithm for Near-Optimal Reinforcement Learning, 2001, J. Mach. Learn. Res.
[14] Doina Precup, et al. Using MDP Characteristics to Guide Exploration in Reinforcement Learning, 2003, ECML.
[15] Jeff G. Schneider, et al. Covariant Policy Search, 2003, IJCAI.
[16] Gal Chechik, et al. Information Bottleneck for Gaussian Variables, 2003, J. Mach. Learn. Res.
[17] William Bialek, et al. Optimal Manifold Representation of Data: An Information Theoretic Approach, 2003, NIPS.
[18] William Bialek, et al. Geometric Clustering Using the Information Bottleneck Method, 2003, NIPS.
[19] J. Crutchfield, et al. Regularities unseen, randomness observed: levels of entropy convergence, 2001, Chaos.
[20] Nuttapong Chentanez, et al. Intrinsically Motivated Reinforcement Learning, 2004, NIPS.
[21] Michael Kearns, et al. Near-Optimal Reinforcement Learning in Polynomial Time, 2002, Machine Learning.
[22] William Bialek, et al. How Many Clusters? An Information-Theoretic Perspective, 2003, Neural Computation.
[23] Richard S. Sutton, et al. Reinforcement Learning: An Introduction, 1998, IEEE Trans. Neural Networks.
[24] C. E. Shannon. A Mathematical Theory of Communication, 1948.
[25] Lihong Li, et al. Incremental Model-based Learners With Formal Learning-Time Guarantees, 2006, UAI.
[26] Satinder P. Singh, et al. On discovery and learning of models with predictive representations of state for agents with continuous actions and observations, 2007, AAMAS.
[27] Pierre-Yves Oudeyer, et al. Intrinsic Motivation Systems for Autonomous Mental Development, 2007, IEEE Transactions on Evolutionary Computation.
[28] Ralf Der, et al. Predictive information and explorative behavior of autonomous robots, 2008.
[29] Nikola Kasabov, et al. 2008 Special Issue, 2008.
[30] Stefan Schaal, et al. Reinforcement learning of motor skills with policy gradients, 2008.
[31] J. Schmidhuber. Science as By-Products of Search for Novel Patterns, or Data Compressible in Unknown Yet Learnable Ways, 2009.
[32] Emanuel Todorov, et al. Efficient computation of optimal actions, 2009, Proceedings of the National Academy of Sciences.
[33] Peter Stone, et al. Transfer Learning for Reinforcement Learning Domains: A Survey, 2009, J. Mach. Learn. Res.
[34] Susanne Still, et al. Information-theoretic approach to interactive learning, 2007, arXiv:0709.1948.
[35] Yasemin Altun, et al. Relative Entropy Policy Search, 2010.
[36] Friedrich T. Sommer, et al. Learning in embodied action-perception loops through exploration, 2011, arXiv.
[37] Daniel Polani, et al. Information Theory of Decisions and Actions, 2011.
[38] Hilbert J. Kappen, et al. Dynamic policy programming, 2010, J. Mach. Learn. Res.