Efficient Bayes-Adaptive Reinforcement Learning using Sample-Based Search
Arthur Guez | David Silver | Peter Dayan
[1] Robert E. Kalaba,et al. On adaptive control processes , 1959 .
[2] P. Randolph. Bayesian Decision Problems and Markov Chains , 1968 .
[3] Christian M. Ernst,et al. Multi-armed Bandit Allocation Indices , 1989 .
[4] J. Bather,et al. Multi-Armed Bandit Allocation Indices , 1990 .
[5] Yoram Singer,et al. Efficient Bayesian Parameter Estimation in Large Discrete Domains , 1998, NIPS.
[6] Stuart J. Russell,et al. Bayesian Q-Learning , 1998, AAAI/IAAI.
[7] Malcolm J. A. Strens,et al. A Bayesian Framework for Reinforcement Learning , 2000, ICML.
[8] Tamer Basar,et al. Dual Control Theory , 2001 .
[9] Andrew G. Barto,et al. Optimal learning: computational procedures for bayes-adaptive markov decision processes , 2002 .
[10] Yishay Mansour,et al. A Sparse Sampling Algorithm for Near-Optimal Planning in Large Markov Decision Processes , 1999, Machine Learning.
[11] Paul Bourgine,et al. Exploration of Multi-State Environments: Local Measures and Back-Propagation of Uncertainty , 1999, Machine Learning.
[12] Tao Wang,et al. Bayesian sparse sampling for on-line reward optimization , 2005, ICML.
[13] Csaba Szepesvári,et al. Bandit Based Monte-Carlo Planning , 2006, ECML.
[14] David Silver,et al. Combining online and offline knowledge in UCT , 2007, ICML '07.
[15] Rémi Munos,et al. Bandit Algorithms for Tree Search , 2007, UAI.
[16] Andrew Y. Ng,et al. Near-Bayesian exploration in polynomial time , 2009, ICML '09.
[17] Lihong Li,et al. A Bayesian Sampling Approach to Exploration in Reinforcement Learning , 2009, UAI.
[18] Rémi Munos,et al. Pure Exploration in Multi-armed Bandits Problems , 2009, ALT.
[19] Richard L. Lewis,et al. Variance-Based Rewards for Approximate Bayesian Reinforcement Learning , 2010, UAI.
[20] Doina Precup,et al. Smarter Sampling in Model-Based Bayesian Reinforcement Learning , 2010, ECML/PKDD.
[21] Thomas J. Walsh,et al. Integrating Sample-Based Planning and Model-Based Reinforcement Learning , 2010, AAAI.
[22] Csaba Szepesvári,et al. Algorithms for Reinforcement Learning , 2010, Synthesis Lectures on Artificial Intelligence and Machine Learning.
[23] Joel Veness,et al. Monte-Carlo Planning in Large POMDPs , 2010, NIPS.
[24] Michael L. Littman,et al. Learning is planning: near Bayes-optimal reinforcement learning via Monte-Carlo tree search , 2011, UAI.
[25] Joelle Pineau,et al. A Bayesian Approach for Learning and Planning in Partially Observable Markov Decision Processes , 2011, J. Mach. Learn. Res..
[26] Michèle Sebag,et al. The grand challenge of computer Go , 2012, Commun. ACM.
[27] Peter Dayan,et al. Scalable and Efficient Bayes-Adaptive Reinforcement Learning Based on Monte-Carlo Tree Search , 2013, J. Artif. Intell. Res..