Nonparametric Bayesian Policy Priors for Reinforcement Learning
Joshua B. Tenenbaum | Nicholas Roy | Finale Doshi-Velez | David Wingate
[1] Edward J. Sondik,et al. The optimal control of partially observable Markov processes , 1971 .
[2] Lonnie Chrisman,et al. Reinforcement Learning with Perceptual Aliasing: The Perceptual Distinctions Approach , 1992, AAAI.
[3] Leslie Pack Kaelbling,et al. Learning Policies for Partially Observable Environments: Scaling Up , 1997, ICML.
[4] David Andre,et al. Model based Bayesian Exploration , 1999, UAI.
[5] Malcolm J. A. Strens,et al. A Bayesian Framework for Reinforcement Learning , 2000, ICML.
[6] Carl E. Rasmussen,et al. Factorial Hidden Markov Models , 1997 .
[7] Joelle Pineau,et al. Point-based value iteration: An anytime algorithm for POMDPs , 2003, IJCAI.
[8] Craig Boutilier,et al. Bounded Finite State Controllers , 2003, NIPS.
[9] Reid G. Simmons,et al. Heuristic Search Value Iteration for POMDPs , 2004, UAI.
[10] Joelle Pineau,et al. Active Learning in Partially Observable Markov Decision Processes , 2005, ECML.
[11] Tao Wang,et al. Bayesian sparse sampling for on-line reward optimization , 2005, ICML.
[12] Doina Precup,et al. Learning in non-stationary Partially Observable Markov Decision Processes , 2005 .
[13] Pieter Abbeel,et al. Using inaccurate models in reinforcement learning , 2006, ICML.
[14] Michael I. Jordan,et al. Hierarchical Dirichlet Processes , 2006 .
[15] Joelle Pineau,et al. Bayes-Adaptive POMDPs , 2007, NIPS.
[16] Pascal Poupart,et al. Model-based Bayesian Reinforcement Learning in Partially Observable Domains , 2008, ISAIM.
[17] Yee Whye Teh,et al. Beam sampling for the infinite hidden Markov model , 2008, ICML '08.
[18] Joelle Pineau,et al. Bayesian reinforcement learning in continuous POMDPs with application to robot navigation , 2008, 2008 IEEE International Conference on Robotics and Automation.
[19] Joelle Pineau,et al. Reinforcement learning with limited reinforcement: using Bayes risk for active learning in POMDPs , 2008, ICML '08.
[20] Andrew Y. Ng,et al. Near-Bayesian exploration in polynomial time , 2009, ICML '09.
[21] Finale Doshi-Velez,et al. The Infinite Partially Observable Markov Decision Process , 2009, NIPS.
[22] Lihong Li,et al. A Bayesian Sampling Approach to Exploration in Reinforcement Learning , 2009, UAI.
[23] Siddhartha S. Srinivasa,et al. Inverse Optimal Heuristic Control for Imitation Learning , 2009, AISTATS.