A Fast Stochastic Error-Descent Algorithm for Supervised Learning and Optimization

A parallel stochastic algorithm is investigated for error-descent learning and optimization in deterministic networks of arbitrary topology. No explicit information about internal network structure is needed. The method is based on the model-free distributed learning mechanism of Dembo and Kailath. We propose a modified parameter update rule under which each individual perturbation of the parameter vector contributes a decrease in error, allowing substantially faster learning. Furthermore, the modified algorithm supports learning of time-varying features in dynamical networks. We analyze the convergence and scaling properties of the algorithm and present simulation results for dynamic trajectory learning in recurrent networks.
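The perturbative update at the heart of the method can be sketched as follows. This is a minimal illustration in the spirit of the Dembo-Kailath mechanism, not the authors' exact rule: the names error_fn, mu, and sigma, and the choice of a Bernoulli perturbation, are assumptions made here for concreteness.

```python
import numpy as np

def stochastic_error_descent(error_fn, p, mu=0.1, sigma=0.01, steps=5000, rng=None):
    """Sketch of model-free stochastic error descent.

    error_fn maps a parameter vector to a scalar error E(p); no internal
    network structure is assumed.  Each step perturbs all parameters in
    parallel by +/- sigma, measures the resulting change in error, and
    steps against the perturbation direction scaled by that change, so
    the expected update follows -grad E.
    """
    rng = np.random.default_rng() if rng is None else rng
    for _ in range(steps):
        pi = sigma * rng.choice([-1.0, 1.0], size=p.shape)  # parallel random perturbation
        delta_e = error_fn(p + pi) - error_fn(p)            # two error evaluations only
        p = p - mu * delta_e * pi / sigma**2                # E[delta_e * pi] = sigma^2 * grad E
    return p

# Toy usage: minimize E(p) = ||p - 1||^2 without ever computing its gradient.
p_star = stochastic_error_descent(lambda p: float(np.sum((p - 1.0) ** 2)),
                                  p=np.zeros(5))
```

Note that only two error evaluations are needed per update regardless of the number of parameters, since all parameters are perturbed in parallel; this is the source of the speedup over sequential per-weight perturbation schemes.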

[1] Geoffrey E. Hinton, et al. Learning internal representations by error propagation, 1986.

[2] Barak A. Pearlmutter. Learning State Space Trajectories in Recurrent Neural Networks, 1989, Neural Computation.

[3] Ronald J. Williams, et al. A Learning Algorithm for Continually Running Fully Recurrent Neural Networks, 1989, Neural Computation.

[4] Bernard Widrow, et al. 30 years of adaptive neural networks: perceptron, Madaline, and backpropagation, 1990, Proceedings of the IEEE.

[5] Amir Dembo, Thomas Kailath. Model-free distributed learning, 1990, IEEE Transactions on Neural Networks.

[6] Stuart Haber, et al. A VLSI-efficient technique for generating multiple uncorrelated noise sources and its application to stochastic neural networks, 1991.

[7] Marwan A. Jabri, et al. Weight Perturbation: An Optimal Architecture and Learning Technique for Analog VLSI Feedforward and Recurrent Multilayer Networks, 1991, Neural Computation.

[8] Kurt W. Fleischer, et al. Analog VLSI Implementation of Gradient Descent, 1992, NIPS.

[9] Marwan A. Jabri, et al. Summed Weight Neuron Perturbation: An O(N) Improvement Over Weight Perturbation, 1992, NIPS.

[10] Jürgen Schmidhuber. A Fixed Size Storage O(n³) Time Complexity Learning Algorithm for Fully Recurrent Continually Running Networks, 1992, Neural Computation.

[11] Ron Meir, et al. A Parallel Gradient Descent Method for Learning in Analog VLSI Neural Networks, 1992, NIPS.

[12] Jacob Barhen, et al. Learning a trajectory using adjoint functions and teacher forcing, 1992, Neural Networks.