Trans-dimensional MCMC for Bayesian policy learning
Matthew W. Hoffman | Nando de Freitas | Arnaud Doucet | Ajay Jasra
[1] Peter L. Bartlett, et al. Infinite-Horizon Policy-Gradient Estimation, 2001, J. Artif. Intell. Res.
[2] R. Vollertsen, et al. Burn-In, 1999, 1999 IEEE International Integrated Reliability Workshop Final Report (Cat. No. 99TH8460).
[3] Marc Toussaint, et al. Probabilistic inference for solving discrete and continuous state Markov Decision Processes, 2006, ICML.
[4] Geoffrey E. Hinton, et al. Using Expectation-Maximization for Reinforcement Learning, 1997, Neural Computation.
[5] Stefan Schaal, et al. Policy Gradient Methods for Robotics, 2006, 2006 IEEE/RSJ International Conference on Intelligent Robots and Systems.
[6] J. Bernardo, et al. Simulation-Based Optimal Design, 1999.
[7] Michael I. Jordan, et al. PEGASUS: A policy search method for large MDPs and POMDPs, 2000, UAI.
[8] P. Green, et al. Trans-dimensional Markov chain Monte Carlo, 2000.
[9] Noah J. Cowan, et al. Efficient Gradient Estimation for Motor Control Learning, 2002, UAI.
[10] Alexander J. Smola, et al. Neural Information Processing Systems, 1997, NIPS 1997.
[11] Rajesh P. N. Rao, et al. Planning and Acting in Uncertain Environments using Probabilistic Inference, 2006, 2006 IEEE/RSJ International Conference on Intelligent Robots and Systems.
[12] P. Green, et al. Corrigendum: On Bayesian analysis of mixtures with an unknown number of components, 1997.
[13] Ben Tse, et al. Autonomous Inverted Helicopter Flight via Reinforcement Learning, 2004, ISER.
[14] Sebastian Thrun, et al. Monte Carlo POMDPs, 1999, NIPS.
[15] Geoffrey E. Hinton, et al. Using EM for Reinforcement Learning, 2000.
[16] Marc Toussaint, et al. Probabilistic inference for solving (PO) MDPs, 2006.
[17] Dimitri P. Bertsekas, et al. Dynamic Programming and Optimal Control, Two Volume Set, 1995.
[18] Nando de Freitas, et al. An Introduction to MCMC for Machine Learning, 2004, Machine Learning.
[19] Hagai Attias, et al. Planning by Probabilistic Inference, 2003, AISTATS.
[20] P. Müller, et al. Optimal Bayesian Design by Inhomogeneous Markov Chain Simulation, 2004.
[21] P. Green. Reversible jump Markov chain Monte Carlo computation and Bayesian model determination, 1995.
[22] Nando de Freitas, et al. Fast particle smoothing: if I had a million particles, 2006, ICML.
[23] Pascal Poupart, et al. Point-Based Value Iteration for Continuous POMDPs, 2006, J. Mach. Learn. Res.