Autonomous shaping: knowledge transfer in reinforcement learning
[1] Richard S. Sutton, et al. Training and Tracking in Robotics, 1985, IJCAI.
[2] Richard E. Korf, et al. Real-Time Heuristic Search, 1990, Artif. Intell.
[3] Vijaykumar Gullapalli, et al. Reinforcement learning and its application to control, 1992.
[4] Sebastian Thrun, et al. Efficient Exploration In Reinforcement Learning, 1992.
[5] Andrew W. Moore, et al. Generalization in Reinforcement Learning: Safely Approximating the Value Function, 1994, NIPS.
[6] Gerald Tesauro, et al. TD-Gammon, a Self-Teaching Backgammon Program, Achieves Master-Level Play, 1994, Neural Computation.
[7] Maja J. Mataric, et al. Reward Functions for Accelerated Learning, 1994, ICML.
[8] Mahesan Niranjan, et al. On-line Q-learning using connectionist systems, 1994.
[9] Sebastian Thrun, et al. Finding Structure in Reinforcement Learning, 1994, NIPS.
[10] Gerald Tesauro, et al. Temporal Difference Learning and TD-Gammon, 1995, J. Int. Comput. Games Assoc.
[11] Dimitri P. Bertsekas, et al. Dynamic Programming and Optimal Control, Two Volume Set, 1995.
[12] Gavin Adrian Rummery. Problem solving with reinforcement learning, 1995.
[13] Richard S. Sutton, et al. Generalization in Reinforcement Learning: Successful Examples Using Sparse Coarse Coding, 1995, NIPS.
[14] Andrew G. Barto, et al. Learning to Act Using Real-Time Dynamic Programming, 1995, Artif. Intell.
[15] Richard S. Sutton, et al. Reinforcement Learning with Replacing Eligibility Traces, 2005, Machine Learning.
[16] Gillian M. Hayes, et al. Robot Shaping --- Principles, Methods and Architectures, 1996.
[17] Marco Colombetti, et al. Robot Shaping: An Experiment in Behavior Engineering, 1997.
[18] Maja J. Mataric, et al. Reinforcement Learning in the Multi-Robot Domain, 1997, Auton. Robots.
[19] Sven Koenig, et al. Exploring Unknown Environments with Real-Time Search or Reinforcement Learning, 1998, NIPS.
[20] Benjamin Van Roy. Learning and value function approximation in complex decision processes, 1998.
[21] Preben Alstrøm, et al. Learning to Drive a Bicycle Using Reinforcement Learning and Shaping, 1998, ICML.
[22] Andrew Y. Ng, et al. Policy Invariance Under Reward Transformations: Theory and Application to Reward Shaping, 1999, ICML.
[23] Daniel S. Bernstein, et al. Reusing Old Policies to Accelerate Learning on New MDPs, 1999.
[24] Andrew G. Barto, et al. Combining Reinforcement Learning with a Local Control Algorithm, 2000, ICML.
[25] Manuela M. Veloso, et al. Layered Learning, 2000, ECML.
[26] James L. McClelland, et al. Autonomous Mental Development by Robots and Animals, 2001, Science.
[27] Andrew G. Barto, et al. PolicyBlocks: An Algorithm for Creating Useful Macro-Actions in Reinforcement Learning, 2002, ICML.
[28] Eric Wiewiora, et al. Potential-Based Shaping and Q-Value Initialization are Equivalent, 2003, J. Artif. Intell. Res.
[29] Richard S. Sutton, et al. Reinforcement learning with replacing eligibility traces, 2004, Machine Learning.
[30] Gerald Tesauro, et al. Practical issues in temporal difference learning, 1992, Machine Learning.
[31] Gillian M. Hayes, et al. Estimating Future Reward in Reinforcement Learning Animats using Associative Learning, 2004.
[32] Andrew W. Moore, et al. Prioritized sweeping: Reinforcement learning with less data and less time, 2004, Machine Learning.
[33] Aude Billard, et al. Estimating Future Reward in Reinforcement Learning Animats using Associative Learning, 2004.
[34] Peter Stone, et al. Value Functions for RL-Based Behavior Transfer: A Comparative Study, 2005, AAAI.
[35] Richard S. Sutton, et al. Reinforcement Learning: An Introduction, 1998, IEEE Trans. Neural Networks.
[36] Reid G. Simmons, et al. The Effect of Representation and Knowledge on Goal-Directed Exploration with Reinforcement-Learning Algorithms, 2005, Machine Learning.
[37] Sridhar Mahadevan, et al. Proto-value functions: developmental reinforcement learning, 2005, ICML.