ARTIFICIAL NEURAL NETWORKS OF THE PERCEPTRON, MADALINE, AND BACKPROPAGATION FAMILY

ABSTRACT Fundamental developments in feedforward artificial neural networks from the past thirty years are reviewed. The central theme of this paper is a description of the history, origin, operating characteristics, and basic theory of several supervised neural network training algorithms, including the Perceptron rule, the LMS algorithm, three Madaline rules, and the backpropagation technique. These methods were developed independently, but with the perspective of history they can all be related to each other. The concept that underlies these algorithms is the “minimal disturbance principle”, which suggests that during training new information should be injected into a network in a manner that disturbs stored information as little as possible. Learning algorithms used in artificial neural networks are probably not representative of learning processes in living neural systems. However, study of these algorithms may give neurobiologists and psychologists some clues about what to look for when studying cognition, pattern classification, and locomotion.
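
As a rough illustration of the minimal disturbance principle, the sketch below assumes the normalized form of the LMS update (often written as α-LMS) for a single linear unit. The function name, the learning rate, and the sample data are hypothetical and chosen only for illustration. The weight change lies along the input vector, which is the smallest change (in Euclidean norm) that reduces the error on the current pattern by the fraction alpha.

```python
def alpha_lms_step(weights, x, target, alpha=0.5):
    """One normalized LMS (alpha-LMS) update for a single linear unit.

    Illustrative sketch only; names and defaults are assumptions,
    not taken from the paper.
    Returns the updated weights and the error measured before the update.
    """
    # Linear output and error for the current pattern.
    output = sum(w * xi for w, xi in zip(weights, x))
    error = target - output

    # Squared length of the input; skip the update for an all-zero input.
    norm_sq = sum(xi * xi for xi in x)
    if norm_sq == 0.0:
        return list(weights), error

    # Move the weights along x just far enough to remove a fraction
    # alpha of the current error.
    step = alpha * error / norm_sq
    new_weights = [w + step * xi for w, xi in zip(weights, x)]
    return new_weights, error


# Hypothetical example: three weights, one training pattern.
weights = [0.0, 0.0, 0.0]
x = [1.0, -1.0, 1.0]
target = 1.0

new_weights, err_before = alpha_lms_step(weights, x, target, alpha=0.5)
err_after = target - sum(w * xi for w, xi in zip(new_weights, x))
print(err_before, err_after)
```

With alpha set to 0.5, the error on the presented pattern after the update should be about half of the error before it, up to floating-point rounding. Responses to other patterns change only to the extent that those patterns overlap with x.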
