6. Conclusions and Ongoing Directions

Getting systems to develop their knowledge bases from experience is a difficult but important challenge. It is hoped that the example the Morph project provides of how several learning methods can be combined to exploit experience in the form of pws will provide at least one framework on which research in experiential learning may proceed. The Morph project will continue over a number of years as we strive to bring its playing level up to that of the current brute-force chess machines. To do this, the effort will be to make the learning mechanisms and their mutual cooperation as sharp as possible. Hopefully, this will prepare the way for more compelling applications of these methods beyond chess. For example, it may be possible in organic synthesis systems to improve search time with experience using similar graph methods [Levinson, 1991b].

The following points are worth remembering:

In combining the many learning methods in the Morph system we have not taken the methods as they are normally used, but have extracted their essence and combined them beneficially.

Guided by appropriate performance measures, modification and testing of the system proceeds systematically.

Interesting ideas arise directly as a result of taking the multi-strategy view. Some examples:

1. The genetic inversion operator described in Section 2.4.2.

2. Optimal pattern population. Just as we are trying to get the learning methods to work in harmony, we are attempting the same coordination with Morph's patterns. The idea is to get a set of patterns that are good predictors as a whole rather than to find strong individual patterns (though the latter may be part of the former).

3. Higher-level concepts via hidden units. Once a good set of patterns has been obtained, we may be able to introduce a more sophisticated evaluation function. This function, patterned after neural nets, would have hidden units that extract higher-level interactions between the patterns. For example, conjunctions and disjunctions may be realized and given weights different from those implied by their components (see the sketch following the acknowledgments).

Acknowledgments

Thank you to Jeff Keller for constructing the new move selection evaluation function, to Paul Zola and Kamal Mostafa for the initial Morph implementation, and to Richard Sutton for sharing our enthusiasm for reinforcement learning.
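To make the third idea above concrete, the following is a minimal Python sketch of an evaluation function whose hidden units operate on pattern activations. It is not Morph's implementation: the pattern-set size, hidden-layer size, and weight values are illustrative assumptions, and in practice the weights would be learned (for example by temporal-difference updates, as Morph does for its pattern weights) rather than fixed at random.

# Minimal sketch (not Morph's implementation) of an evaluation function with
# hidden units over pattern activations. Hidden units can act as soft
# conjunctions of several patterns, so a combination of patterns can receive
# a weight different from the one implied by its components.
# All names and sizes here are illustrative assumptions.

import math
import random

N_PATTERNS = 8   # assumed size of the learned pattern set
N_HIDDEN = 4     # assumed number of hidden units

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

# Randomly initialized weights stand in for values that would be learned.
random.seed(0)
W_hidden = [[random.uniform(-0.5, 0.5) for _ in range(N_PATTERNS)]
            for _ in range(N_HIDDEN)]
b_hidden = [0.0] * N_HIDDEN
W_out = [random.uniform(-0.5, 0.5) for _ in range(N_HIDDEN)]
b_out = 0.0

def evaluate(matched_patterns):
    """Return an evaluation in (0, 1) for a position described by the
    indices of the patterns it matches."""
    # Binary feature vector: 1.0 if the pattern is present in the position.
    x = [1.0 if i in matched_patterns else 0.0 for i in range(N_PATTERNS)]
    # Hidden layer: each unit responds to a weighted combination of patterns.
    h = [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
         for row, b in zip(W_hidden, b_hidden)]
    # Output layer: combine hidden-unit activations into a single evaluation.
    return sigmoid(sum(w * hi for w, hi in zip(W_out, h)) + b_out)

if __name__ == "__main__":
    print(evaluate({0, 2, 5}))   # evaluation when patterns 0, 2 and 5 match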