Efficient PAC-Optimal Exploration in Concurrent, Continuous State MDPs with Delayed Updates
[1] Ronald J. Williams, et al. Tight Performance Bounds on Greedy Policies Based on Imperfect Value Functions, 1993.
[2] Emma Brunskill, et al. Concurrent PAC RL, 2015, AAAI.
[3] Peter Stone, et al. RTMBA: A Real-Time Model-Based Reinforcement Learning Architecture for robot control, 2011, 2012 IEEE International Conference on Robotics and Automation.
[4] Jonathan P. How, et al. Sample Efficient Reinforcement Learning with Gaussian Processes, 2014, ICML.
[5] Csaba Szepesvári, et al. Model-based reinforcement learning with nearly tight exploration complexity bounds, 2010, ICML.
[6] Richard S. Sutton, et al. Introduction to Reinforcement Learning, 1998.
[7] John Langford, et al. Exploration in Metric State Spaces, 2003, ICML.
[8] Peter Stone, et al. TEXPLORE: real-time sample-efficient reinforcement learning for robots, 2012, Machine Learning.
[9] Michael L. Littman, et al. A unifying framework for computational reinforcement learning theory, 2009.
[10] Lihong Li, et al. Reinforcement Learning in Finite MDPs: PAC Analysis, 2009, J. Mach. Learn. Res..
[11] Richard S. Sutton, et al. Reinforcement Learning: An Introduction, 1998, IEEE Trans. Neural Networks.
[12] Sham M. Kakade, et al. On the sample complexity of reinforcement learning, 2003.
[13] Jason Pazis, et al. PAC Optimal Exploration in Continuous Space Markov Decision Processes, 2013, AAAI.
[14] Peter Stone, et al. Intrinsically motivated model learning for a developing curious agent, 2012, 2012 IEEE International Conference on Development and Learning and Epigenetic Robotics (ICDL).
[15] Peter Auer, et al. Near-optimal Regret Bounds for Reinforcement Learning, 2008, J. Mach. Learn. Res..
[16] Timothy A. Mann. Scaling Up Reinforcement Learning without Sacrificing Optimality by Constraining Exploration, 2012.
[17] Tor Lattimore, et al. PAC Bounds for Discounted MDPs, 2012, ALT.
[18] Pieter Abbeel, et al. Safe Exploration in Markov Decision Processes, 2012, ICML.