Multi-task reinforcement learning: shaping and feature selection

Shaping functions can be used in multi-task reinforcement learning (RL) to incorporate knowledge from previously experienced source tasks and thereby speed up learning on a new target task. Earlier work has not clearly motivated the choice of shaping function. This paper discusses and empirically compares several alternatives, and demonstrates that the most intuitive choice is not always the best option. In addition, we extend previous work on identifying good representations for the value and shaping functions, and show that selecting the right representation improves generalization across tasks.
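For context, the standard potential-based formulation defines the shaping reward as F(s, s') = gamma * Phi(s') - Phi(s) for some potential function Phi over states, which leaves the optimal policy unchanged; in the multi-task setting, Phi is typically derived from value estimates learned on source tasks. Below is a minimal sketch of tabular Q-learning with such a shaping term; the grid world and the distance-based potential phi stand in for a source-task value function and are illustrative assumptions, not the paper's actual experimental setup.

```python
import random
from collections import defaultdict

GAMMA, ALPHA, EPSILON = 0.95, 0.1, 0.1
ACTIONS = [(0, 1), (0, -1), (1, 0), (-1, 0)]
SIZE, GOAL = 5, (4, 4)

def phi(state):
    # Hypothetical potential: negative Manhattan distance to the goal,
    # standing in for a value function learned on source tasks.
    return -(abs(state[0] - GOAL[0]) + abs(state[1] - GOAL[1]))

def step(state, action):
    # Deterministic grid world: move, clip to the grid, reward 1 at the goal.
    nxt = (min(max(state[0] + action[0], 0), SIZE - 1),
           min(max(state[1] + action[1], 0), SIZE - 1))
    return nxt, (1.0 if nxt == GOAL else 0.0), nxt == GOAL

Q = defaultdict(float)

for episode in range(500):
    s, done = (0, 0), False
    for t in range(200):  # step cap so every episode terminates
        if random.random() < EPSILON:
            a = random.choice(ACTIONS)
        else:
            a = max(ACTIONS, key=lambda b: Q[(s, b)])
        s2, r, done = step(s, a)
        # Potential-based shaping: F(s, s') = gamma * phi(s') - phi(s).
        shaped = r + GAMMA * phi(s2) - phi(s)
        target = shaped if done else shaped + GAMMA * max(Q[(s2, b)] for b in ACTIONS)
        Q[(s, a)] += ALPHA * (target - Q[(s, a)])
        s = s2
        if done:
            break
```

Because F telescopes over a trajectory, any choice of potential affects only learning speed, never the optimal policy, which is what makes an empirical comparison of candidate shaping functions meaningful.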
