Online Learning on Spheres

Classical online learning algorithms such as the LMS (Least Mean Square) algorithm have attracted renewed interest recently because of the many variants based on non-standard parametrisations that bias the algorithms towards certain kinds of prior knowledge about the problem. Tools such as link functions and Bregman divergences have been used in the design and analysis of such algorithms. In this paper we reconsider the development of such variants of classical stochastic gradient descent (SGD) in a purely geometric setting: the Bregman divergence is replaced by the (squared) Riemannian distance, and the convexity of a loss function is replaced by a more general notion of compatibility with a metric. The ideas are explicated in the development and analysis of an algorithm for online learning on spheres.
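To make the geometric idea concrete, the following is a minimal sketch (not the paper's algorithm) of one style of Riemannian SGD update on the unit sphere: the Euclidean gradient is projected onto the tangent space at the current point, and the update follows the geodesic via the sphere's exponential map, so the iterate never leaves the manifold. The function names, learning rate, and the least-squares loss are illustrative assumptions.

```python
import numpy as np

def riemannian_sgd_step(w, euclid_grad, lr):
    """One illustrative Riemannian SGD step on the unit sphere.

    Projects the Euclidean gradient onto the tangent space at w,
    then moves along the geodesic (exponential map) in the negative
    gradient direction. This is a generic sketch, not the specific
    update rule developed in the paper.
    """
    # Tangent-space projection: remove the radial component of the gradient.
    g = euclid_grad - np.dot(euclid_grad, w) * w
    norm = np.linalg.norm(g)
    if norm < 1e-12:            # gradient is purely radial: no tangential move
        return w
    t = lr * norm               # geodesic step length (in radians)
    # Exponential map on the sphere: exp_w(-lr * g).
    return np.cos(t) * w - np.sin(t) * (g / norm)

# Usage: online least-squares updates with unit-norm target and examples
# (hypothetical setup for illustration only).
rng = np.random.default_rng(0)
w_true = np.array([1.0, 0.0, 0.0])   # assumed target on the sphere
w = np.array([0.0, 0.0, 1.0])        # initial hypothesis on the sphere
for _ in range(300):
    x = rng.normal(size=3)
    x /= np.linalg.norm(x)
    y = np.dot(w_true, x)
    grad = 2.0 * (np.dot(w, x) - y) * x   # Euclidean gradient of squared loss
    w = riemannian_sgd_step(w, grad, lr=0.1)

print(np.round(np.linalg.norm(w), 6))    # iterate stays on the sphere: 1.0
```

Because the update uses the exponential map rather than a projected Euclidean step, the constraint `||w|| = 1` holds exactly (up to floating point) at every iteration, which is the kind of intrinsic behaviour the geometric viewpoint is meant to capture.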