Paper Information - Variable Metric Stochastic Approximation Theory

Variable Metric Stochastic Approximation Theory

We provide a variable metric stochastic approximation theory. In doing so, we provide a convergence theory for a large class of online variable metric methods including the recently introduced online versions of the BFGS algorithm and its limited-memory LBFGS variant. We also discuss the implications of our results for learning from expert advice.

[1] J. Blum. Multidimensional Stochastic Approximation Methods, 1954.

[2] H. Robbins, et al. A Convergence Theorem for Non Negative Almost Supermartingales and Some Applications, 1985.

[3] Pierre Priouret, et al. Adaptive Algorithms and Stochastic Approximations, 1990, Applications of Mathematics.

[4] Manfred K. Warmuth, et al. Exponentiated Gradient Versus Gradient Descent for Linear Predictors, 1997, Inf. Comput.

[5] Shun-ichi Amari, et al. Natural Gradient Works Efficiently in Learning, 1998, Neural Computation.

[6] Noboru Murata, et al. A Statistical Study on On-line Learning, 1999.

[7] Shun-ichi Amari, et al. Methods of information geometry, 2000.

[8] Ali H. Sayed, et al. Linear Estimation (Information and System Sciences Series), 2000.

[9] D K Smith, et al. Numerical Optimization, 2001, J. Oper. Res. Soc.

[10] Nicol N. Schraudolph, et al. Fast Curvature Matrix-Vector Products for Second-Order Gradient Descent, 2002, Neural Computation.

[11] Martin Zinkevich, et al. Online Convex Programming and Generalized Infinitesimal Gradient Ascent, 2003, ICML.

[12] Manfred K. Warmuth, et al. Relative Loss Bounds for On-Line Density Estimation with the Exponential Family of Distributions, 1999, Machine Learning.

[13] Léon Bottou, et al. On-line learning for very large data sets, 2005.

[14] H. Robbins. A Stochastic Approximation Method, 1951.

[15] Elad Hazan, et al. Logarithmic regret algorithms for online convex optimization, 2006, Machine Learning.

[16] Simon Günter, et al. A Stochastic Quasi-Newton Method for Online Convex Optimization, 2007, AISTATS.

[17] Yoav Shoham, et al. Eliciting properties of probability distributions, 2008, EC '08.

[18] Yoram Singer, et al. Pegasos: primal estimated sub-gradient solver for SVM, 2011, Math. Program.