Increasingly Cautious Optimism for Practical PAC-MDP Exploration
Xin Yao | Ke Tang | Liangpeng Zhang
[1] Andrew Y. Ng,et al. Near-Bayesian exploration in polynomial time , 2009, ICML '09.
[2] Lihong Li,et al. PAC model-free reinforcement learning , 2006, ICML.
[3] Csaba Szepesvári,et al. Model-based reinforcement learning with nearly tight exploration complexity bounds , 2010, ICML.
[4] Stuart J. Russell,et al. Bayesian Q-Learning , 1998, AAAI/IAAI.
[5] Ronen I. Brafman,et al. R-MAX - A General Polynomial Time Algorithm for Near-Optimal Reinforcement Learning , 2001, J. Mach. Learn. Res.
[6] Paul Bourgine,et al. Exploration of Multi-State Environments: Local Measures and Back-Propagation of Uncertainty , 1999, Machine Learning.
[7] Richard S. Sutton,et al. Introduction to Reinforcement Learning , 1998 .
[8] R. Lathe. Phd by thesis , 1988, Nature.
[9] Anthony G. Cohn,et al. Proceedings of the 20th national conference on Artificial intelligence - Volume 1 , 2005, AAAI 2005.
[10] Andrew W. Moore,et al. Reinforcement Learning: A Survey , 1996, J. Artif. Intell. Res.
[11] Michael L. Littman,et al. An empirical evaluation of interval estimation for Markov decision processes , 2004, 16th IEEE International Conference on Tools with Artificial Intelligence.
[12] Lawrence Birnbaum,et al. Proceedings of the eighth international workshop on Machine learning , 1991 .
[13] J. van Leeuwen,et al. Theoretical Computer Science , 2003, Lecture Notes in Computer Science.
[14] Lihong Li,et al. Sample Complexity Bounds of Exploration , 2012, Reinforcement Learning.
[15] Steven D. Whitehead,et al. Complexity and Cooperation in Q-Learning , 1991, ML.
[16] William W. Cohen,et al. Proceedings of the 23rd international conference on Machine learning , 2006, ICML 2006.
[17] Zoubin Ghahramani,et al. Proceedings of the 24th international conference on Machine learning , 2007, ICML 2007.
[18] Tor Lattimore,et al. Near-optimal PAC bounds for discounted MDPs , 2014, Theor. Comput. Sci.
[19] Michael L. Littman,et al. A theoretical analysis of Model-Based Interval Estimation , 2005, ICML.
[20] Shimon Whiteson,et al. V-MAX: tempered optimism for better PAC reinforcement learning , 2012, AAMAS.
[21] Jürgen Schmidhuber,et al. Efficient model-based exploration , 1998 .
[22] Thomas J. Walsh,et al. Knows what it knows: a framework for self-aware learning , 2008, ICML '08.
[23] Sham M. Kakade,et al. On the sample complexity of reinforcement learning. , 2003 .
[24] Lihong Li,et al. Reinforcement Learning in Finite MDPs: PAC Analysis , 2009, J. Mach. Learn. Res.
[25] Peter Stone,et al. Reinforcement learning , 2019, Scholarpedia.
[26] András Lörincz,et al. The many faces of optimism: a unifying approach , 2008, ICML '08.
[27] Marc Toussaint,et al. Probabilistic inference for solving discrete and continuous state Markov Decision Processes , 2006, ICML.
[28] Michael L. Littman,et al. Efficient Reinforcement Learning with Relocatable Action Models , 2007, AAAI.
[29] Martin L. Puterman,et al. Markov Decision Processes: Discrete Stochastic Dynamic Programming , 1994 .
[30] Stewart W. Wilson,et al. From Animals to Animats 5. Proceedings of the Fifth International Conference on Simulation of Adaptive Behavior , 1997 .
[31] Andrew G. Barto,et al. Reinforcement learning , 1998 .