Exemplar models are useful and deep neural networks overcome their limitations: A commentary on Ambridge (2020)

Humans are sensitive to the properties of individual items, and exemplar models are useful for capturing this sensitivity. I am a proponent of an extension of exemplar-based architectures, which I briefly describe. However, exemplar models are very shallow architectures in which the modeler must stipulate a set of primitive elements that make up each example, and such architectures have not been as successful as deep neural networks in capturing language usage and meaning. More work is needed to bring the contemporary deep learning architectures used in machine intelligence into the effort to understand human language processing.
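The contrast drawn above can be made concrete with a minimal sketch of a classic exemplar model, in the spirit of the context theory of classification learning [10]. All feature values, parameter settings, and function names below are illustrative assumptions, not part of the commentary: each stored example is a vector of stipulated primitive features, and a new item is classified by its summed similarity to the stored exemplars of each category.

```python
# Minimal sketch of a shallow exemplar model (hypothetical illustration).
# The key point: the modeler must stipulate the primitive feature set in
# advance, unlike a deep network, which learns internal representations.

def similarity(item, exemplar, mismatch=0.3):
    # Multiplicative similarity: each mismatching feature scales the
    # similarity down by a fixed factor (an assumed free parameter).
    s = 1.0
    for a, b in zip(item, exemplar):
        if a != b:
            s *= mismatch
    return s

def classify(item, exemplars_by_category):
    # Each category's evidence is its summed similarity to the item;
    # probabilities come from normalizing over categories (Luce choice rule).
    sums = {c: sum(similarity(item, e) for e in exs)
            for c, exs in exemplars_by_category.items()}
    total = sum(sums.values())
    return {c: s / total for c, s in sums.items()}

# Toy memory of stored exemplars over hand-stipulated binary features.
memory = {
    "A": [(1, 1, 1, 0), (1, 0, 1, 0)],
    "B": [(0, 0, 0, 1), (0, 1, 0, 1)],
}
probs = classify((1, 1, 1, 1), memory)
```

Because the test item shares more features with the stored "A" exemplars, the model assigns it a higher probability of belonging to category "A"; all of the model's generalization is carried by similarity over the pre-specified features.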

[1] James L. McClelland, et al. On learning the past tenses of English verbs: Implicit rules or parallel distributed processing?, 1986.

[2] James L. McClelland, et al. An interactive activation model of context effects in letter perception: Part 1. An account of basic findings, 1988.

[3] Colin Raffel, et al. Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer, 2019, J. Mach. Learn. Res.

[4] Ha Hong, et al. Performance-optimized hierarchical models predict neural responses in higher visual cortex, 2014, Proceedings of the National Academy of Sciences.

[5] Hinrich Schütze, et al. Extending Machine Language Models toward Human-Level Language Understanding, 2019, ArXiv.

[6] James L. McClelland, et al. An interactive activation model of context effects in letter perception: I. An account of basic findings, 1981.

[7] James L. McClelland, et al. Generalization Through the Recurrent Interaction of Episodic Memories, 2012, Psychological Review.

[8] Geoffrey E. Hinton, et al. ImageNet classification with deep convolutional neural networks, 2012, Commun. ACM.

[9] B. Ambridge. Against stored abstractions: A radical exemplar model of language acquisition, 2020, First Language.

[10] Douglas L. Medin, et al. Context theory of classification learning, 1978.

[11] James L. McClelland. Capturing Gradience, Continuous Change, and Quasi-Regularity in Sound, Word, Phrase, and Meaning, 2015.

[12] G. Kane. Parallel Distributed Processing: Explorations in the Microstructure of Cognition, Vol. 1: Foundations; Vol. 2: Psychological and Biological Models, 1994.