Weight Space Probability Densities in Stochastic Learning: II. Transients and Basin Hopping Times

In stochastic learning, weights are random variables whose time evolution is governed by a Markov process. At each time step n, the weights can be described by a probability density function P(w, n). We summarize the theory of the time evolution of P, and give graphical examples of this evolution that contrast the behavior of stochastic learning with true gradient descent (batch learning). Finally, we use the formalism to obtain predictions of the time required for noise-induced hopping between basins of different optima. We compare the theoretical predictions with simulations of large ensembles of networks for simple problems in supervised and unsupervised learning.
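
To make the ensemble picture concrete, the following is a minimal sketch, not the paper's actual experiments: an ensemble of one-dimensional "networks" descends an assumed double-well cost E(w) = (w^2 - 1)^2 / 4, whose minima at w = -1 and w = +1 stand in for basins of different optima. A histogram of the ensemble at step n approximates P(w, n), and first crossings of the barrier at w = 0 give an empirical basin hopping time. All names and parameters (learning rate mu, noise level sigma, ensemble size) are illustrative choices, not values from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def grad_E(w):
    """Gradient of the assumed double-well cost: dE/dw = w (w^2 - 1)."""
    return w * (w * w - 1.0)

n_ensemble = 5_000   # independent networks in the ensemble (illustrative)
n_steps    = 10_000  # learning steps
mu         = 0.1     # learning rate (assumed)
sigma      = 1.0     # std. dev. of the per-step gradient noise (assumed)

# Every network starts at the left optimum, w = -1.
w = np.full(n_ensemble, -1.0)
first_hop = np.full(n_ensemble, -1)  # step of first barrier crossing; -1 = none yet

for n in range(n_steps):
    # Stochastic learning step: true gradient plus Gaussian sampling noise.
    w -= mu * (grad_E(w) + sigma * rng.standard_normal(n_ensemble))
    # A basin hop is the first passage over the barrier at w = 0.
    just_hopped = (w > 0.0) & (first_hop < 0)
    first_hop[just_hopped] = n

# The ensemble histogram approximates the density P(w, n) at the final step;
# it starts as a spike at w = -1 and develops a second mode at w = +1.
hist, edges = np.histogram(w, bins=80, range=(-2.0, 2.0), density=True)

hops = first_hop[first_hop >= 0]
print(f"fraction of networks that hopped basins: {hops.size / n_ensemble:.2f}")
if hops.size:
    print(f"empirical mean first-passage time: {hops.mean():.0f} steps")
```

Setting sigma to zero recovers true gradient descent: every network then sits at the exact minimum w = -1, where the gradient vanishes, and no hopping ever occurs. This is the contrast between stochastic and batch learning drawn above, and the empirical mean first-passage time is the quantity the formalism is used to predict.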