Generalized exploration in policy search
[1] Petros Koumoutsakos,et al. Reducing the Time Complexity of the Derandomized Evolution Strategy with Covariance Matrix Adaptation (CMA-ES) , 2003, Evolutionary Computation.
[2] Carl E. Rasmussen,et al. Gaussian process dynamic programming , 2009, Neurocomputing.
[3] Joshua B. Tenenbaum,et al. Nonparametric Bayesian Policy Priors for Reinforcement Learning , 2010, NIPS.
[4] Benjamin Van Roy,et al. Generalization and Exploration via Randomized Value Functions , 2014, ICML.
[5] Wouter Caarls,et al. Learning while preventing mechanical failure due to random motions , 2013, 2013 IEEE/RSJ International Conference on Intelligent Robots and Systems.
[6] Peter L. Bartlett,et al. Infinite-Horizon Policy-Gradient Estimation , 2001, J. Artif. Intell. Res..
[7] Jan Peters,et al. Learning of Non-Parametric Control Policies with High-Dimensional State Features , 2015, AISTATS.
[8] Yang Liu,et al. A new Q-learning algorithm based on the metropolis criterion , 2004, IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics).
[9] Benjamin Van Roy,et al. Deep Exploration via Bootstrapped DQN , 2016, NIPS.
[10] Malcolm J. A. Strens,et al. A Bayesian Framework for Reinforcement Learning , 2000, ICML.
[11] P. Cochat,et al. Et al , 2008, Archives de pediatrie : organe officiel de la Societe francaise de pediatrie.
[12] Jan Peters,et al. A Survey on Policy Search for Robotics , 2013, Found. Trends Robotics.
[13] Jun Nakanishi,et al. Learning Movement Primitives , 2005, ISRR.
[14] Satinder Singh. Transfer of learning by composing solutions of elemental sequential tasks , 2004, Machine Learning.
[15] Chris Watkins,et al. Sex as Gibbs Sampling: a probability model of evolution , 2014, ArXiv.
[16] Daniel A. Braun,et al. A Minimum Relative Entropy Principle for Learning and Acting , 2008, J. Artif. Intell. Res..
[17] Stefan Schaal,et al. Hierarchical reinforcement learning with movement primitives , 2011, 2011 11th IEEE-RAS International Conference on Humanoid Robots.
[18] Leslie Pack Kaelbling,et al. Bayesian Policy Search with Policy Priors , 2011, IJCAI.
[19] Jan Peters,et al. Hierarchical Relative Entropy Policy Search , 2014, AISTATS.
[20] Jan Peters,et al. Reinforcement learning in robotics: A survey , 2013, Int. J. Robotics Res..
[21] M. Deisenroth,et al. Probabilistic Inference for Fast Learning in Control , 2008, EWRL.
[22] Peter Stone,et al. Policy gradient reinforcement learning for fast quadrupedal locomotion , 2004, IEEE International Conference on Robotics and Automation, 2004. Proceedings. ICRA '04. 2004.
[23] Rémi Munos,et al. Policy Gradient in Continuous Time , 2006, J. Mach. Learn. Res..
[24] Bruno Castro da Silva,et al. Learning Parameterized Skills , 2012, ICML.
[25] Tom Schaul,et al. Exploring parameter space in reinforcement learning , 2010, Paladyn J. Behav. Robotics.
[26] Yuval Tassa,et al. Continuous control with deep reinforcement learning , 2015, ICLR.
[27] Jun Morimoto,et al. Acquisition of stand-up behavior by a real robot using hierarchical reinforcement learning , 2000, Robotics Auton. Syst..
[28] Lihong Li,et al. A Bayesian Sampling Approach to Exploration in Reinforcement Learning , 2009, UAI.
[29] Olivier Sigaud,et al. Path Integral Policy Improvement with Covariance Matrix Adaptation , 2012, ICML.
[30] Jan Peters,et al. Probabilistic inference for determining options in reinforcement learning , 2016, Machine Learning.
[31] Jeremy Wyatt,et al. Exploration and inference in learning from reinforcement , 1998 .
[32] Stuart J. Russell,et al. Reinforcement Learning with Hierarchies of Machines , 1997, NIPS.
[33] Andrew W. Moore,et al. Reinforcement Learning: A Survey , 1996, J. Artif. Intell. Res..
[34] Thomas G. Dietterich. Hierarchical Reinforcement Learning with the MAXQ Value Function Decomposition , 1999, J. Artif. Intell. Res..
[35] Doina Precup,et al. Between MDPs and Semi-MDPs: A Framework for Temporal Abstraction in Reinforcement Learning , 1999, Artif. Intell..
[36] Frank Sehnke,et al. Parameter-exploring policy gradients , 2010, Neural Networks.
[37] Leslie Pack Kaelbling,et al. Hierarchical Learning in Stochastic Domains: Preliminary Results , 1993, ICML.
[38] Nando de Freitas,et al. Bayesian Policy Learning with Trans-Dimensional MCMC , 2007, NIPS.
[39] R. J. Williams. Simple Statistical Gradient-Following Algorithms for Connectionist Reinforcement Learning , 1992, Machine Learning.
[40] Darwin G. Caldwell,et al. Direct policy search reinforcement learning based on particle filtering , 2012, EWRL 2012.
[41] Yasemin Altun,et al. Relative Entropy Policy Search , 2010 .
[42] Sridhar Mahadevan,et al. Hierarchical Policy Gradient Algorithms , 2003, ICML.
[43] Alex Graves,et al. Strategic Attentive Writer for Learning Macro-Actions , 2016, NIPS.
[44] Stuart J. Russell,et al. Bayesian Q-Learning , 1998, AAAI/IAAI.
[45] Jan Peters,et al. Policy Search for Motor Primitives in Robotics , 2008, NIPS 2008.
[46] Andrew G. Barto,et al. Skill Discovery in Continuous Reinforcement Learning Domains using Skill Chaining , 2009, NIPS.
[47] Richard S. Sutton,et al. Reinforcement Learning: An Introduction , 1998, IEEE Trans. Neural Networks.
[48] Peter Stone,et al. Deep Reinforcement Learning in Parameterized Action Space , 2015, ICLR.
[49] Doina Precup,et al. Temporal abstraction in reinforcement learning , 2000, ICML 2000.
[50] David Andre,et al. Model based Bayesian Exploration , 1999, UAI.
[51] W. K. Hastings,et al. Monte Carlo Sampling Methods Using Markov Chains and Their Applications , 1970 .
[52] Benjamin Recht,et al. Random Features for Large-Scale Kernel Machines , 2007, NIPS.
[53] Stefan Schaal,et al. A Generalized Path Integral Control Approach to Reinforcement Learning , 2010, J. Mach. Learn. Res..