Paper information - An interior-point stochastic approximation method and an L1-regularized delta rule

The stochastic approximation method is behind the solution to many important, actively studied problems in machine learning. Despite its far-reaching application, there is almost no work on applying stochastic approximation to learning problems with general constraints. The reason for this, we hypothesize, is that no robust, widely applicable stochastic approximation method exists for handling such problems. We propose that interior-point methods are a natural solution. We establish the stability of a stochastic interior-point approximation method both analytically and empirically, and demonstrate its utility by deriving an on-line learning algorithm that also performs feature selection via L1 regularization.

Mark W. Schmidt | Nando de Freitas | Peter Carbonetto

[1] H. Robbins. A Stochastic Approximation Method, 1951.

[2] Margaret H. Wright, et al. Some properties of the Hessian of the logarithmic barrier function, 1994, Math. Program.

[3] Stephen P. Boyd, et al. Convex Optimization, 2004, Algorithms and Theory of Computation Handbook.

[4] R. Tibshirani. Regression Shrinkage and Selection via the Lasso, 1996.

[5] James C. Spall, et al. Adaptive stochastic approximation by the simultaneous perturbation method, 2000, IEEE Trans. Autom. Control.

[6] M. J. D. Powell, et al. Nonlinear Programming: Sequential Unconstrained Minimization Techniques, 1969.

[7] Stephen P. Boyd, et al. An Interior-Point Method for Large-Scale $\ell_1$-Regularized Least Squares, 2007, IEEE Journal of Selected Topics in Signal Processing.

[8] B. T. Poljak. Nonlinear programming methods in the presence of noise, 1978, Math. Program.

[9] L. Trefethen, et al. Numerical linear algebra, 1997.

[10] J. Spall, et al. Stochastic optimization with inequality constraints using simultaneous perturbations and penalty functions, 2003, 42nd IEEE International Conference on Decision and Control.

[11] Eric R. Ziegel, et al. The Elements of Statistical Learning, 2003, Technometrics.

[12] Mark W. Schmidt, et al. Fast Optimization Methods for L1 Regularization: A Comparative Study and Two New Approaches, 2007, ECML.

[13] Stephen J. Wright. Effects of Finite-Precision Arithmetic on Interior-Point Methods for Nonlinear Programming, 2001, SIAM J. Optim.

[14] Antonio Torralba, et al. LabelMe: A Database and Web-Based Tool for Image Annotation, 2008, International Journal of Computer Vision.

[15] David Madigan, et al. Large-Scale Bayesian Logistic Regression for Text Categorization, 2007, Technometrics.

[16] Harold J. Kushner, et al. Stochastic approximation methods for constrained and unconstrained systems, 1978.

[17] J. Spall, et al. Model-free control of nonlinear stochastic systems with discrete-time measurements, 1998, IEEE Trans. Autom. Control.

[18] T. Tsuchiya, et al. On the formulation and theory of the Newton interior-point method for nonlinear programming, 1996.

[19] Michael I. Jordan, et al. Statistical software debugging, 2005.

[20] Gordon V. Cormack, et al. Spam Corpus Creation for TREC, 2005, CEAS.

[21] D. K. Smith, et al. Numerical Optimization, 2001, J. Oper. Res. Soc.

[22] Léon Bottou, et al. The Tradeoffs of Large Scale Learning, 2007, NIPS.

[23] Anders Forsgren, et al. Interior Methods for Nonlinear Optimization, 2002, SIAM Rev.

[24] Gordon V. Cormack, et al. Online supervised spam filter evaluation, 2007, TOIS.

[25] Thomas G. Dietterich. What is machine learning?, 2020, Archives of Disease in Childhood.

[26] Margaret H. Wright, et al. Ill-Conditioning and Computational Error in Interior Methods for Nonlinear Programming, 1998, SIAM J. Optim.

[27] Michael A. Saunders, et al. Atomic Decomposition by Basis Pursuit, 1998, SIAM J. Sci. Comput.