Adaptive Mixtures of Local Experts

We present a new supervised learning procedure for systems composed of many separate networks, each of which learns to handle a subset of the complete set of training cases. The new procedure can be viewed either as a modular version of a multilayer supervised network, or as an associative version of competitive learning. It therefore provides a new link between these two apparently different approaches. We demonstrate that the learning procedure divides up a vowel discrimination task into appropriate subtasks, each of which can be solved by a very simple expert network.
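The idea of several networks that each take responsibility for a subset of the training cases can be illustrated with a small sketch. The abstract does not spell out the objective or how cases are assigned to networks, so everything below is an assumption made for illustration: a softmax gate produces mixing proportions, each expert's output is treated as the mean of a unit-variance Gaussian, and the loss is the negative log-likelihood of the target under that mixture. The names `softmax`, `mixture_loss`, `gate_scores`, and `expert_outputs` are hypothetical.

```python
import math

def softmax(scores):
    """Turn raw gating scores into mixing proportions that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def mixture_loss(gate_scores, expert_outputs, target):
    """Return (loss, responsibilities) for one training case.

    Assumed objective: negative log-likelihood of `target` under a
    mixture of unit-variance Gaussians, one centered on each expert's
    output and weighted by the gate's mixing proportions.
    """
    proportions = softmax(gate_scores)
    weighted = []
    for p, out in zip(proportions, expert_outputs):
        sq_err = sum((t - o) ** 2 for t, o in zip(target, out))
        weighted.append(p * math.exp(-0.5 * sq_err))
    total = sum(weighted)
    # Responsibility: the share of this case credited to each expert.
    # Experts that already fit the case well receive most of the credit,
    # which is what pushes different experts toward different subtasks.
    responsibilities = [w / total for w in weighted]
    return -math.log(total), responsibilities

# Two experts, an undecided gate, and a target that expert 0 matches exactly.
loss, resp = mixture_loss(
    gate_scores=[0.0, 0.0],
    expert_outputs=[[1.0, 0.0], [0.0, 1.0]],
    target=[1.0, 0.0],
)
print(loss, resp)
```

In this sketch the better-fitting expert receives the larger responsibility, so a gradient step weighted by these values would mostly adapt that expert and leave the other free to specialize on different cases.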
