Online Learning on Spheres

Classical online learning algorithms such as the LMS (Least Mean Square) algorithm have attracted renewed interest recently because of the many variants based on non-standard parametrisations that bias the algorithms towards certain kinds of prior knowledge about the problem. Tools such as link functions and Bregman divergences have been used in the design and analysis of such algorithms. In this paper we reconsider the development of such variants of classical stochastic gradient descent (SGD) in a purely geometric setting: the Bregman divergence is replaced by the (squared) Riemannian distance, and the convexity of a loss function is replaced by a more general notion of compatibility with a metric. The ideas are explicated in the development and analysis of an algorithm for online learning on spheres.
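To make the geometric idea concrete, the following is a minimal sketch (not the paper's algorithm) of one style of Riemannian SGD update on the unit sphere: the Euclidean gradient is projected onto the tangent space at the current point, and the update follows the geodesic via the sphere's exponential map, so the iterate never leaves the manifold. The function names, learning rate, and the least-squares loss are illustrative assumptions.

```python
import numpy as np

def riemannian_sgd_step(w, euclid_grad, lr):
    """One illustrative Riemannian SGD step on the unit sphere.

    Projects the Euclidean gradient onto the tangent space at w,
    then moves along the geodesic (exponential map) in the negative
    gradient direction. This is a generic sketch, not the specific
    update rule developed in the paper.
    """
    # Tangent-space projection: remove the radial component of the gradient.
    g = euclid_grad - np.dot(euclid_grad, w) * w
    norm = np.linalg.norm(g)
    if norm < 1e-12:            # gradient is purely radial: no tangential move
        return w
    t = lr * norm               # geodesic step length (in radians)
    # Exponential map on the sphere: exp_w(-lr * g).
    return np.cos(t) * w - np.sin(t) * (g / norm)

# Usage: online least-squares updates with unit-norm target and examples
# (hypothetical setup for illustration only).
rng = np.random.default_rng(0)
w_true = np.array([1.0, 0.0, 0.0])   # assumed target on the sphere
w = np.array([0.0, 0.0, 1.0])        # initial hypothesis on the sphere
for _ in range(300):
    x = rng.normal(size=3)
    x /= np.linalg.norm(x)
    y = np.dot(w_true, x)
    grad = 2.0 * (np.dot(w, x) - y) * x   # Euclidean gradient of squared loss
    w = riemannian_sgd_step(w, grad, lr=0.1)

print(np.round(np.linalg.norm(w), 6))    # iterate stays on the sphere: 1.0
```

Because the update uses the exponential map rather than a projected Euclidean step, the constraint `||w|| = 1` holds exactly (up to floating point) at every iteration, which is the kind of intrinsic behaviour the geometric viewpoint is meant to capture.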