Variance-Based Rewards for Approximate Bayesian Reinforcement Learning
暂无分享,去创建一个
[1] Peter Norvig,et al. Artificial Intelligence: A Modern Approach , 1995 .
[2] Michael O. Duff,et al. Design for an Optimal Probe , 2003, ICML.
[3] Michael Kearns,et al. Near-Optimal Reinforcement Learning in Polynomial Time , 1998, Machine Learning.
[4] Jesse Hoey,et al. An analytic solution to discrete Bayesian reinforcement learning , 2006, ICML.
[5] Lihong Li,et al. Incremental Model-based Learners With Formal Learning-Time Guarantees , 2006, UAI.
[6] Michael L. Littman,et al. An analysis of model-based Interval Estimation for Markov Decision Processes , 2008, J. Comput. Syst. Sci..
[7] Andrew Y. Ng,et al. Near-Bayesian exploration in polynomial time , 2009, ICML '09.
[8] Lihong Li,et al. A Bayesian Sampling Approach to Exploration in Reinforcement Learning , 2009, UAI.