Hierarchical Relative Entropy Policy Search
[1] D. Rubin,et al. Maximum likelihood from incomplete data via the EM algorithm plus discussions on the paper , 1977 .
[2] Richard S. Sutton,et al. Generalization in Reinforcement Learning: Successful Examples Using Sparse Coarse Coding , 1995, NIPS.
[3] Geoffrey E. Hinton,et al. Using Expectation-Maximization for Reinforcement Learning , 1997, Neural Computation.
[4] Doina Precup,et al. Intra-Option Learning about Temporally Abstract Actions , 1998, ICML.
[5] Richard S. Sutton,et al. Introduction to Reinforcement Learning , 1998 .
[6] Doina Precup,et al. Between MDPs and Semi-MDPs: A Framework for Temporal Abstraction in Reinforcement Learning , 1999, Artif. Intell..
[7] Yishay Mansour,et al. Policy Gradient Methods for Reinforcement Learning with Function Approximation , 1999, NIPS.
[8] Thomas G. Dietterich. Hierarchical Reinforcement Learning with the MAXQ Value Function Decomposition , 1999, J. Artif. Intell. Res..
[9] Peter L. Bartlett,et al. Infinite-Horizon Policy-Gradient Estimation , 2001, J. Artif. Intell. Res..
[10] Andrew G. Barto,et al. Robot Weightlifting By Direct Policy Search , 2001, IJCAI.
[11] Jeff G. Schneider,et al. Autonomous helicopter control using reinforcement learning policy search methods , 2001, Proceedings 2001 ICRA. IEEE International Conference on Robotics and Automation.
[12] Jun Morimoto,et al. Acquisition of stand-up behavior by a real robot using hierarchical reinforcement learning , 2000, Robotics Auton. Syst..
[13] Peter Stone,et al. Scaling Reinforcement Learning toward RoboCup Soccer , 2001, ICML.
[14] Jun Nakanishi,et al. Learning Attractor Landscapes for Learning Motor Primitives , 2002, NIPS.
[15] Andrew W. Moore,et al. Policy Search using Paired Comparisons , 2003, J. Mach. Learn. Res..
[16] Sridhar Mahadevan,et al. Recent Advances in Hierarchical Reinforcement Learning , 2003, Discret. Event Dyn. Syst..
[17] Petros Koumoutsakos,et al. Reducing the Time Complexity of the Derandomized Evolution Strategy with Covariance Matrix Adaptation (CMA-ES) , 2003, Evolutionary Computation.
[18] Jeff G. Schneider,et al. Covariant Policy Search , 2003, IJCAI.
[19] Jun Nakanishi,et al. Learning Movement Primitives , 2005, ISRR.
[20] Sridhar Mahadevan,et al. Hierarchical Policy Gradient Algorithms , 2003, ICML.
[21] Peter Stone,et al. Policy gradient reinforcement learning for fast quadrupedal locomotion , 2004, IEEE International Conference on Robotics and Automation, 2004. Proceedings. ICRA '04. 2004.
[22] Nuttapong Chentanez,et al. Intrinsically Motivated Learning of Hierarchical Collections of Skills , 2004 .
[23] Richard S. Sutton,et al. Reinforcement Learning: An Introduction , 1998, IEEE Trans. Neural Networks.
[24] Stefan Schaal,et al. Natural Actor-Critic , 2003, Neurocomputing.
[25] Stefan Schaal,et al. Policy Gradient Methods for Robotics , 2006, 2006 IEEE/RSJ International Conference on Intelligent Robots and Systems.
[26] Marc Toussaint,et al. Probabilistic inference for solving (PO)MDPs , 2006 .
[27] Ben Taskar,et al. Expectation Maximization and Posterior Constraints , 2007, NIPS.
[28] Jan Peters,et al. Fitted Q-iteration by Advantage Weighted Regression , 2008, NIPS.
[29] Jun Morimoto,et al. Learning CPG-based Biped Locomotion with a Policy Gradient Method: Application to a Humanoid Robot , 2005, 5th IEEE-RAS International Conference on Humanoid Robots, 2005..
[30] Betty J. Mohler,et al. Learning perceptual coupling for motor primitives , 2008, 2008 IEEE/RSJ International Conference on Intelligent Robots and Systems.
[31] Tom Schaul,et al. Natural Evolution Strategies , 2008, 2008 IEEE Congress on Evolutionary Computation (IEEE World Congress on Computational Intelligence).
[32] Jan Peters,et al. Learning complex motions by sequencing simpler motion templates , 2009, ICML '09.
[33] Ben Taskar,et al. Posterior vs Parameter Sparsity in Latent Variable Models , 2009, NIPS.
[34] Andrew G. Barto,et al. Skill Discovery in Continuous Reinforcement Learning Domains using Skill Chaining , 2009, NIPS.
[35] Jan Peters,et al. Policy Search for Motor Primitives , 2009, Künstliche Intell..
[36] Christoph H. Lampert,et al. Movement templates for learning of hitting and batting , 2010, 2010 IEEE International Conference on Robotics and Automation.
[37] Darwin G. Caldwell,et al. Robot motor skill coordination with EM-based Reinforcement Learning , 2010, 2010 IEEE/RSJ International Conference on Intelligent Robots and Systems.
[38] Jan Peters,et al. Policy Search for Motor Primitives in Robotics , 2022 .
[39] Yasemin Altun,et al. Relative Entropy Policy Search , 2010 .
[40] Marc Peter Deisenroth,et al. Efficient reinforcement learning using Gaussian processes , 2010 .
[41] Marc Toussaint,et al. Approximate Inference and Stochastic Optimal Control , 2010, ArXiv.
[42] Jun Morimoto,et al. Task-Specific Generalization of Discrete and Periodic Dynamic Movement Primitives , 2010, IEEE Transactions on Robotics.
[43] Stefan Schaal,et al. A Generalized Path Integral Control Approach to Reinforcement Learning , 2010, J. Mach. Learn. Res..
[44] Nahum Shimkin,et al. Unified Inter and Intra Options Learning Using Policy Gradient Methods , 2011, EWRL.
[45] Carl E. Rasmussen,et al. PILCO: A Model-Based and Data-Efficient Approach to Policy Search , 2011, ICML.
[46] Gerhard Neumann,et al. Variational Inference for Policy Search in changing situations , 2011, ICML.
[47] Doina Precup,et al. An information-theoretic approach to curiosity-driven reinforcement learning , 2012, Theory in Biosciences.
[48] Stefan Schaal,et al. Hierarchical reinforcement learning with movement primitives , 2011, 2011 11th IEEE-RAS International Conference on Humanoid Robots.
[49] Jan Peters,et al. Reinforcement Learning to Adjust Robot Movements to New Situations , 2010, IJCAI.
[50] Leslie Pack Kaelbling,et al. Bayesian Policy Search with Policy Priors , 2011, IJCAI.
[51] George Konidaris,et al. Value Function Approximation in Reinforcement Learning Using the Fourier Basis , 2011, AAAI.
[52] Oliver Kroemer,et al. Learning to select and generalize striking movements in robot table tennis , 2012, AAAI Fall Symposium: Robots Learning Interactively from Human Teachers.
[53] Marc Toussaint,et al. On Stochastic Optimal Control and Reinforcement Learning by Approximate Inference , 2012, Robotics: Science and Systems.
[54] Jan Peters,et al. Learning concurrent motor skills in versatile solution spaces , 2012, 2012 IEEE/RSJ International Conference on Intelligent Robots and Systems.
[55] Hilbert J. Kappen,et al. Dynamic policy programming , 2010, J. Mach. Learn. Res..
[56] Olivier Sigaud,et al. Policy Improvement Methods: Between Black-Box Optimization and Episodic Reinforcement Learning , 2012 .
[57] Andrew G. Barto,et al. Motor primitive discovery , 2012, 2012 IEEE International Conference on Development and Learning and Epigenetic Robotics (ICDL).
[58] Bruno Castro da Silva,et al. Learning Parameterized Skills , 2012, ICML.
[59] Darwin G. Caldwell,et al. Compliant skills acquisition and multi-optima policy search with EM-based reinforcement learning , 2013, Robotics Auton. Syst..
[60] Jan Peters,et al. Data-Efficient Generalization of Robot Skills with Contextual Policy Search , 2013, AAAI.
[61] Oliver Kroemer,et al. Learning sequential motor tasks , 2013, 2013 IEEE International Conference on Robotics and Automation.
[62] Sergey Levine,et al. Learning Neural Network Policies with Guided Policy Search under Unknown Dynamics , 2014, NIPS.