Non-parametric Policy Search with Limited Information Loss
[1] Byron Boots, et al. Hilbert Space Embeddings of Predictive State Representations, 2013, UAI.
[2] Jan Peters, et al. Reinforcement Learning to Adjust Robot Movements to New Situations, 2010, IJCAI.
[3] Klaus Obermayer, et al. Construction of approximation spaces for reinforcement learning, 2013, J. Mach. Learn. Res.
[4] Sergey Levine, et al. End-to-End Training of Deep Visuomotor Policies, 2015, J. Mach. Learn. Res.
[5] Jan Peters, et al. Generalizing Movements with Information-Theoretic Stochastic Optimal Control, 2014, J. Aerosp. Inf. Syst.
[6] Jan Peters, et al. Learning robot in-hand manipulation with tactile features, 2015, IEEE-RAS International Conference on Humanoid Robots (Humanoids).
[7] Marc Toussaint, et al. Path Integral Control by Reproducing Kernel Hilbert Space Embedding, 2013, IJCAI.
[8] Guy Lever, et al. Modelling transition dynamics in MDPs with RKHS embeddings, 2012, ICML.
[9] R. J. Williams. Simple Statistical Gradient-Following Algorithms for Connectionist Reinforcement Learning, 1992, Machine Learning.
[10] André da Motta Salles Barreto, et al. Reinforcement Learning using Kernel-Based Stochastic Factorization, 2011, NIPS.
[11] Jan Peters, et al. Reinforcement learning in robotics: A survey, 2013, Int. J. Robotics Res.
[12] Carl E. Rasmussen, et al. Gaussian process dynamic programming, 2009, Neurocomputing.
[13] Kenji Fukumizu, et al. Hilbert Space Embeddings of POMDPs, 2012, UAI.
[14] Csaba Szepesvári. Algorithms for Reinforcement Learning, 2010, Synthesis Lectures on Artificial Intelligence and Machine Learning.
[15] Peter L. Bartlett, et al. An Introduction to Reinforcement Learning Theory: Value Function Methods, 2002, Machine Learning Summer School.
[16] Cristina P. Santos, et al. Using Cost-regularized Kernel Regression with a high number of samples, 2014, IEEE International Conference on Autonomous Robot Systems and Competitions (ICARSC).
[17] Shane Legg, et al. Human-level control through deep reinforcement learning, 2015, Nature.
[18] Benjamin Recht, et al. Random Features for Large-Scale Kernel Machines, 2007, NIPS.
[19] Shie Mannor, et al. Bayes Meets Bellman: The Gaussian Process Approach to Temporal Difference Learning, 2003, ICML.
[20] Martin A. Riedmiller, et al. Embed to Control: A Locally Linear Latent Dynamics Model for Control from Raw Images, 2015, NIPS.
[21] Guy Lever, et al. Modelling Policies in MDPs in Reproducing Kernel Hilbert Space, 2015, AISTATS.
[22] Geoffrey E. Hinton, et al. Using Expectation-Maximization for Reinforcement Learning, 1997, Neural Computation.
[23] N. Roy, et al. On Stochastic Optimal Control and Reinforcement Learning by Approximate Inference, 2013.
[24] Vicenç Gómez, et al. Dynamic Policy Programming with Function Approximation, 2011, AISTATS.
[25] Stefan Schaal, et al. Policy Gradient Methods for Robotics, 2006, IEEE/RSJ International Conference on Intelligent Robots and Systems.
[26] Daniel Polani, et al. Information Theory of Decisions and Actions, 2011.
[27] Jan Peters, et al. Data-Efficient Generalization of Robot Skills with Contextual Policy Search, 2013, AAAI.
[28] Dirk Ormoneit, et al. Kernel-Based Reinforcement Learning, 2017, Encyclopedia of Machine Learning and Data Mining.
[29] Shie Mannor, et al. Biases and Variance in Value Function Estimates, 2004.
[30] Alex Smola, et al. Kernel methods in machine learning, 2007, math/0701907.
[31] Alessandro Lazaric, et al. LSTD with Random Projections, 2010, NIPS.
[32] Oliver Brock, et al. Learning state representations with robotic priors, 2015, Auton. Robots.
[33] Sham M. Kakade. A Natural Policy Gradient, 2001, NIPS.
[34] Zoubin Ghahramani, et al. Sparse Gaussian Processes using Pseudo-inputs, 2005, NIPS.
[35] Neil D. Lawrence, et al. Fast Forward Selection to Speed Up Sparse Gaussian Process Regression, 2003, AISTATS.
[36] Xin Xu, et al. Kernel-Based Least Squares Policy Iteration for Reinforcement Learning, 2007, IEEE Transactions on Neural Networks.
[37] Long-Ji Lin. Reinforcement learning for robots using neural networks, 1992.
[38] Dimitri P. Bertsekas. Dynamic Programming and Optimal Control, Volume 1, 1996.
[39] Jason Pazis, et al. Non-Parametric Approximate Linear Programming for MDPs, 2011, AAAI.
[40] Jan Peters, et al. Policy Learning - A Unified Perspective with Applications in Robotics, 2008, EWRL.
[41] Peter Vrancx, et al. Reinforcement Learning: State-of-the-Art, 2012.
[42] Martin A. Riedmiller, et al. Learn to Swing Up and Balance a Real Pole Based on Raw Visual Input Data, 2012, ICONIP.
[43] Yasemin Altun, et al. Relative Entropy Policy Search, 2010.
[44] David B. Dunson, et al. Bayesian Data Analysis, 2010.
[45] Byron Boots, et al. Closing the learning-planning loop with predictive state representations, 2009, Int. J. Robotics Res.
[46] Gavin Taylor, et al. Kernelized value function approximation for reinforcement learning, 2009, ICML.
[47] Yuval Tassa, et al. Continuous control with deep reinforcement learning, 2015, ICLR.
[48] Carl E. Rasmussen, et al. Gaussian Processes in Reinforcement Learning, 2003, NIPS.
[49] Stefan Schaal, et al. Natural Actor-Critic, 2003, Neurocomputing.
[50] John Langford, et al. Approximately Optimal Approximate Reinforcement Learning, 2002, ICML.
[51] John N. Tsitsiklis, et al. Neuro-Dynamic Programming, 1996, Athena Scientific.
[52] Warren B. Powell. Approximate Dynamic Programming: Solving the Curses of Dimensionality, 2007, Wiley Series in Probability and Statistics.
[53] Le Song, et al. Kernel Embeddings of Conditional Distributions: A unified kernel framework for nonparametric inference in graphical models, 2013.
[54] Richard S. Sutton, et al. Reinforcement Learning: An Introduction, 1998, MIT Press.
[55] Andrew W. Moore, et al. Reinforcement Learning: A Survey, 1996, J. Artif. Intell. Res.
[56] Alex Graves, et al. Asynchronous Methods for Deep Reinforcement Learning, 2016, ICML.
[57] Richard S. Sutton. Integrated Architectures for Learning, Planning, and Reacting Based on Approximating Dynamic Programming, 1990, ML.
[58] Joelle Pineau, et al. Bellman Error Based Feature Generation using Random Projections on Sparse Spaces, 2013, NIPS.
[59] Sergey Levine, et al. Learning Neural Network Policies with Guided Policy Search under Unknown Dynamics, 2014, NIPS.
[60] T. Jung, et al. Kernelizing LSPE(λ), 2007, IEEE International Symposium on Approximate Dynamic Programming and Reinforcement Learning.
[61] Lihong Li, et al. An analysis of linear models, linear value-function approximation, and feature selection for reinforcement learning, 2008, ICML.
[62] Warren B. Powell. Approximate Dynamic Programming: Solving the Curses of Dimensionality, 2007.
[63] George Konidaris, et al. Value Function Approximation in Reinforcement Learning Using the Fourier Basis, 2011, AAAI.
[64] Brian Kingsbury, et al. A comparison between deep neural nets and kernel acoustic models for speech recognition, 2016, IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[65] Martin A. Riedmiller, et al. Autonomous reinforcement learning on raw visual input data in a real world application, 2012, International Joint Conference on Neural Networks (IJCNN).
[66] Gergely Neu, et al. Online learning in episodic Markovian decision processes by relative entropy policy search, 2013, NIPS.
[67] Bernhard Schölkopf, et al. A Generalized Representer Theorem, 2001, COLT/EuroCOLT.
[68] Pieter Abbeel, et al. Benchmarking Deep Reinforcement Learning for Continuous Control, 2016, ICML.
[69] Jan Peters, et al. Hierarchical Relative Entropy Policy Search, 2014, AISTATS.
[70] Daniele Calandriello, et al. Safe Policy Iteration, 2013, ICML.
[71] Stefan Schaal, et al. Reinforcement learning by reward-weighted regression for operational space control, 2007, ICML.
[72] Doina Precup, et al. An information-theoretic approach to curiosity-driven reinforcement learning, 2012, Theory in Biosciences.
[73] Jan Peters, et al. A Survey on Policy Search for Robotics, 2013, Found. Trends Robotics.
[74] Peter Englert, et al. Policy Search in Reproducing Kernel Hilbert Space, 2016, IJCAI.
[75] Richard S. Sutton. Generalization in Reinforcement Learning: Successful Examples Using Sparse Coarse Coding, 1995, NIPS.
[76] Matthew W. Hoffman, et al. Predictive Entropy Search for Efficient Global Optimization of Black-box Functions, 2014, NIPS.
[77] Guy Lever, et al. Conditional mean embeddings as regressors, 2012, ICML.
[78] Haibo He, et al. Kernel-Based Approximate Dynamic Programming for Real-Time Online Learning Control: An Experimental Study, 2014, IEEE Transactions on Control Systems Technology.
[79] Gunnar Rätsch, et al. Input space versus feature space in kernel-based methods, 1999, IEEE Trans. Neural Networks.
[80] Yishay Mansour, et al. Policy Gradient Methods for Reinforcement Learning with Function Approximation, 1999, NIPS.
[81] Jan Peters, et al. Learning of Non-Parametric Control Policies with High-Dimensional State Features, 2015, AISTATS.
[82] Oliver Kroemer, et al. A Non-Parametric Approach to Dynamic Programming, 2011, NIPS.
[83] Jeff G. Schneider, et al. Covariant Policy Search, 2003, IJCAI.
[84] Sergey Levine, et al. Deep spatial autoencoders for visuomotor learning, 2016, IEEE International Conference on Robotics and Automation (ICRA).
[85] Michail G. Lagoudakis, et al. Least-Squares Policy Iteration, 2003, J. Mach. Learn. Res.
[86] Dimitri P. Bertsekas. Dynamic Programming and Optimal Control, Two Volume Set, 1995.
[87] Sergey Levine, et al. Trust Region Policy Optimization, 2015, ICML.
[88] Bart De Schutter, et al. Reinforcement Learning and Dynamic Programming Using Function Approximators, 2010.
[89] Alexander J. Smola, et al. Hilbert space embeddings of conditional distributions with applications to dynamical systems, 2009, ICML.
[90] Carl E. Rasmussen, et al. PILCO: A Model-Based and Data-Efficient Approach to Policy Search, 2011, ICML.