An Analysis of Case-Based Value Function Approximation by Approximating State Transition Graphs

We identify two fundamental points of contact between CBR and an adaptive agent that tries to learn by trial and error without a model of its environment. The first link concerns the most efficient exploitation of the experience the agent has collected by interacting with its environment, while the second relates to the acquisition and representation of a suitable behavior policy. Combining both connections, we develop a state-action value function approximation mechanism that relies on case-based, approximate transition graphs and forms the basis on which the agent improves its behavior; a brief sketch of such a mechanism is given below. We evaluate our approach empirically in the context of dynamic control tasks.
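The following minimal sketch illustrates the general idea, not the authors' exact algorithm: observed transitions (s, a, r, s') are stored as cases, an approximate transition graph is induced by nearest-neighbor links between successor states and stored cases, and Q-values are obtained by value iteration over that graph. All function names and parameters here are hypothetical.

```python
import numpy as np

def build_case_base(transitions):
    """transitions: iterable of (state, action, reward, next_state) tuples."""
    return [(np.asarray(s, float), a, r, np.asarray(sn, float))
            for s, a, r, sn in transitions]

def nearest_case_indices(case_base, state, action, k=1):
    """Indices of the k stored cases with the same action that lie closest to state."""
    dists = [(np.linalg.norm(c[0] - state), i)
             for i, c in enumerate(case_base) if c[1] == action]
    return [i for _, i in sorted(dists)[:k]]

def value_iteration(case_base, actions, gamma=0.95, sweeps=50):
    """Assign a Q-value to each stored case by sweeping over the approximate
    transition graph induced by nearest-neighbor links between cases."""
    q = np.zeros(len(case_base))
    for _ in range(sweeps):
        for i, (s, a, r, s_next) in enumerate(case_base):
            # Value of the successor state: best Q over its nearest stored cases.
            candidates = []
            for a2 in actions:
                idx = nearest_case_indices(case_base, s_next, a2, k=1)
                if idx:
                    candidates.append(q[idx[0]])
            v_next = max(candidates) if candidates else 0.0
            q[i] = r + gamma * v_next
    return q

def greedy_action(case_base, q, state, actions, k=3):
    """Act greedily with respect to the case-based Q approximation."""
    def q_of(a):
        idx = nearest_case_indices(case_base, np.asarray(state, float), a, k)
        return float(np.mean(q[idx])) if idx else -np.inf
    return max(actions, key=q_of)
```

In this reading, the case base doubles as the model-free agent's experience memory and as the nodes of the approximate transition graph: exploiting collected experience corresponds to reusing stored transitions during the value-iteration sweeps, and the behavior policy is represented implicitly by acting greedily over the case-based Q-values.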
