Forward stagewise regression and the monotone lasso

We consider the least angle regression and forward stagewise algorithms for solving penalized least squares regression problems. Efron, Hastie, Johnstone & Tibshirani (2004) proved that the least angle regression algorithm, with a small modification, solves the lasso regression problem. Here we give an analogous result for incremental forward stagewise regression, showing that it solves a version of the lasso problem that enforces monotonicity. One consequence is the following: while the lasso makes optimal progress in reducing the residual sum-of-squares per unit increase in the $L_1$-norm of the coefficient vector $\beta$, forward stagewise is optimal per unit of $L_1$ arc-length traveled along the coefficient path. We also study a condition under which the coefficient paths of the lasso are monotone, and hence the different algorithms coincide. Finally, we compare the lasso and forward stagewise procedures in a simulation study involving a large number of correlated predictors.
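As a rough illustration of the incremental forward stagewise procedure mentioned above, the following is a minimal sketch, not the authors' implementation. It assumes the columns of `X` are centered and scaled to unit norm and that `y` is centered; the function name `forward_stagewise`, the step size `eps`, the cap `n_steps`, and the stopping rule are choices made here for illustration only.

```python
import numpy as np


def forward_stagewise(X, y, eps=0.01, n_steps=5000):
    """Sketch of incremental forward stagewise regression.

    Assumption: columns of X are centered with unit Euclidean norm,
    and y is centered. Returns the final coefficients and the path.
    """
    n, p = X.shape
    beta = np.zeros(p)
    residual = np.asarray(y, dtype=float).copy()
    path = [beta.copy()]
    for _ in range(n_steps):
        # Inner product of each predictor with the current residual.
        corr = X.T @ residual
        j = int(np.argmax(np.abs(corr)))
        # With unit-norm columns, a step of size eps on predictor j changes
        # the RSS by -2*eps*|corr[j]| + eps**2, so stop once that is >= 0.
        if abs(corr[j]) <= eps / 2:
            break
        delta = eps * np.sign(corr[j])
        beta[j] += delta                # move one coefficient by +/- eps
        residual -= delta * X[:, j]     # update the residual accordingly
        path.append(beta.copy())
    return beta, np.array(path)
```

Each iteration moves exactly one coefficient by `eps`, so the $L_1$ arc-length of the path after $k$ steps is $k \cdot$ `eps`; this is the quantity against which the abstract measures progress for forward stagewise. Smaller values of `eps` trace a smoother path at the cost of more iterations.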

R. Tibshirani | T. Hastie | Jonathan E. Taylor | A. Dalalyan | G. Walther | L. Comminges

[1] R. Tibshirani. Regression Shrinkage and Selection via the Lasso, 1996.

[2] Yoav Freund, et al. Boosting the margin: A new explanation for the effectiveness of voting methods, 1997, ICML.

[3] Michael A. Saunders, et al. Atomic Decomposition by Basis Pursuit, 1998, SIAM J. Sci. Comput.

[4] J. Friedman. Special Invited Paper-Additive logistic regression: A statistical view of boosting, 2000.

[5] M. R. Osborne, et al. A new approach to variable selection in least squares problems, 2000.

[6] J. Friedman. Greedy function approximation: A gradient boosting machine, 2001.

[7] R. Tibshirani, et al. Least angle regression, 2004, math/0406456.

[8] Joel A. Tropp, et al. Greed is good: algorithmic results for sparse approximation, 2004, IEEE Transactions on Information Theory.

[9] D. Ruppert. The Elements of Statistical Learning: Data Mining, Inference, and Prediction, 2004.

[10] P. Zhao. Boosted Lasso, 2004.

[11] Bogdan E. Popescu, et al. Gradient Directed Regularization, 2004.

[12] Saharon Rosset, et al. Tracking Curved Regularized Optimization Solution Paths, 2004, NIPS 2004.

[13] Mee Young Park, et al. L1-regularization path algorithm for generalized linear models, 2006.

[14] D. Hinkley. Annals of Statistics, 2006.

[15] P. Bühlmann. Boosting for high-dimensional linear models, 2006.

[16] Joel A. Tropp, et al. Just relax: convex programming methods for identifying sparse signals in noise, 2006, IEEE Transactions on Information Theory.

[17] Mee Young Park, et al. L1-regularization path algorithm for generalized linear models, 2007.

[18] S. Rosset, et al. Piecewise linear regularized solution paths, 2007, 0708.2197.

[19] Yaakov Tsaig, et al. Fast Solution of $\ell_1$-Norm Minimization Problems When the Solution May Be Sparse, 2008, IEEE Transactions on Information Theory.

[20] D. Donoho, et al. Fast Solution of $\ell_1$-Norm Minimization Problems When the Solution May Be Sparse, 2008.