Template-based procedures for neural network interpretation

Although neural networks often achieve impressive learning and generalization performance, their internal workings are typically all but impossible to decipher. This opacity is one of the disadvantages of connectionism compared to more traditional, rule-oriented approaches to artificial intelligence. Without a thorough understanding of network behavior, confidence in a system's results is lowered, and the transfer of learned knowledge to other processing systems, including humans, is precluded. Methods that address the opacity problem by casting network weights in symbolic terms are commonly referred to as rule extraction techniques. This work describes a principled approach to symbolic rule extraction from standard multilayer feedforward networks based on the notion of weight templates: parameterized regions of weight space corresponding to specific symbolic expressions. With an appropriate choice of representation, we show how template parameters may be efficiently identified and instantiated to yield the optimal match to the actual weights of a unit. Depending on the requirements of the application domain, the approach can accommodate n-ary disjunctions and conjunctions with O(k) complexity, simple n-of-m expressions with O(k^2) complexity, or more general classes of recursive n-of-m expressions with O(k^(L+2)) complexity, where k is the number of inputs to a unit and L is the recursion level of the expression class. Compared to other approaches in the literature, our method of rule extraction offers benefits in simplicity, computational performance, and overall flexibility. Simulation results on a variety of problems demonstrate the application of our procedures as well as the strengths and weaknesses of our general approach.
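To make the idea of a weight template concrete, the following is a minimal sketch of matching a single unit to a simple n-of-m expression. It is an illustration under stated assumptions, not the procedure of the paper itself: inputs are assumed to be coded as -1/+1, a template is assumed to place a shared magnitude p (with sign) on m selected inputs, zero on the rest, and (m - 2n + 1)p on the bias, and the match is assumed to be scored by squared Euclidean distance. The function name best_n_of_m and its return format are hypothetical.

```python
from typing import List, Optional, Tuple

Literal = Tuple[int, bool]  # (input index, True if the input appears un-negated)


def best_n_of_m(weights: List[float], bias: float) -> Tuple[int, List[Literal], float]:
    """Return (n, literals, squared_error) for the closest n-of-m template.

    Assumed template form (not taken from the source): weight sign_i * p on
    each of the m selected inputs, 0 on the others, bias (m - 2n + 1) * p.
    """
    k = len(weights)
    # Rank inputs by weight magnitude; the m largest are the candidates for each m.
    order = sorted(range(k), key=lambda i: abs(weights[i]), reverse=True)
    total_sq = sum(w * w for w in weights) + bias * bias

    best: Optional[Tuple[int, int, float]] = None  # (n, m, error)
    abs_sum = 0.0
    for m in range(1, k + 1):
        abs_sum += abs(weights[order[m - 1]])
        for n in range(1, m + 1):
            c = m - 2 * n + 1               # bias coefficient of the template
            dot = abs_sum + c * bias        # actual weights projected on the template
            norm_sq = m + c * c             # squared length of the unit-magnitude template
            p = max(dot / norm_sq, 0.0)     # least-squares magnitude, kept non-negative
            err = total_sq - 2 * p * dot + p * p * norm_sq
            if best is None or err < best[2]:
                best = (n, m, err)

    n, m, err = best
    literals = [(i, weights[i] >= 0) for i in order[:m]]
    return n, literals, err
```

For example, best_n_of_m([3.0, 3.0], -3.0) selects n = 2 over both inputs (a conjunction), and best_n_of_m([-3.0, 3.0], 3.0) selects n = 1 with the first input negated (a disjunction). In this sketch the nested loops over m and n visit on the order of k^2 candidate templates after one sort, which is in line with the quadratic cost quoted for simple n-of-m expressions.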
