Paper information - A Study of Error Variance Estimation in Lasso Regression

A Study of Error Variance Estimation in Lasso Regression

Variance estimation in the linear model when $p > n$ is a difficult problem. Standard least squares estimation techniques do not apply. Several variance estimators have been proposed in the literature, all with accompanying asymptotic results proving consistency and asymptotic normality under a variety of assumptions. It is found, however, that most of these estimators suffer large biases in finite samples when the true underlying signals become less sparse and the per-element signal strength grows. One estimator seems to be largely neglected in the literature: a residual sum of squares based estimator using Lasso coefficients with the regularisation parameter selected adaptively (via cross-validation). In this paper, we review several variance estimators and perform a reasonably extensive simulation study in an attempt to compare their finite sample performance. The results suggest that variance estimators with adaptively chosen regularisation parameters perform admirably over a broad range of sparsity and signal strength settings. Finally, some initial theoretical analyses pertaining to these types of estimators are proposed and developed.
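To make the residual sum of squares based estimator concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes the estimator takes the form $\hat\sigma^2 = \|y - X\hat\beta_{\hat\lambda}\|_2^2 / (n - \hat s)$, where $\hat\lambda$ is chosen by K-fold cross-validation and $\hat s$ is the number of nonzero Lasso coefficients; the abstract does not spell out this degrees-of-freedom correction, so treat it as an assumption. The function names (`soft_threshold`, `lasso_cd`, `cv_lambda`, `lasso_cv_variance`), the $\lambda$ grid, and the simulated data are all hypothetical choices made for illustration, and the data are assumed to be centred so that no intercept is needed.

```python
import numpy as np


def soft_threshold(z, t):
    """Soft-thresholding operator used by the Lasso coordinate update."""
    return np.sign(z) * max(abs(z) - t, 0.0)


def lasso_cd(X, y, lam, n_iter=200, tol=1e-6):
    """Cyclic coordinate descent for (1/2n)||y - Xb||^2 + lam * ||b||_1."""
    n, p = X.shape
    beta = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n
    resid = y.astype(float).copy()
    for _ in range(n_iter):
        max_step = 0.0
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # Correlation of column j with the partial residual (beta_j removed)
            rho = X[:, j] @ resid / n + col_sq[j] * beta[j]
            new = soft_threshold(rho, lam) / col_sq[j]
            if new != beta[j]:
                resid += X[:, j] * (beta[j] - new)
                max_step = max(max_step, abs(new - beta[j]))
                beta[j] = new
        if max_step < tol:
            break
    return beta


def cv_lambda(X, y, lambdas, n_folds=5, seed=0):
    """Pick the lambda with the smallest K-fold prediction error."""
    n = X.shape[0]
    folds = np.random.default_rng(seed).permutation(n) % n_folds
    cv_err = np.zeros(len(lambdas))
    for k in range(n_folds):
        train, test = folds != k, folds == k
        for i, lam in enumerate(lambdas):
            beta = lasso_cd(X[train], y[train], lam)
            cv_err[i] += np.sum((y[test] - X[test] @ beta) ** 2)
    return lambdas[int(np.argmin(cv_err))]


def lasso_cv_variance(X, y, n_lambdas=10, n_folds=5, seed=0):
    """RSS-based variance estimate at the cross-validated lambda."""
    n = X.shape[0]
    lam_max = np.max(np.abs(X.T @ y)) / n  # all coefficients are zero at this lambda
    lambdas = lam_max * np.logspace(0, -2, n_lambdas)  # assumed grid
    lam_hat = cv_lambda(X, y, lambdas, n_folds, seed)
    beta = lasso_cd(X, y, lam_hat)
    s_hat = int(np.count_nonzero(beta))
    rss = float(np.sum((y - X @ beta) ** 2))
    # Assumed correction: divide by n - s_hat (guarded against a zero divisor)
    return rss / max(n - s_hat, 1), lam_hat, s_hat


if __name__ == "__main__":
    # Simulated p > n example with true error variance 1 (illustrative only)
    rng = np.random.default_rng(1)
    n, p = 50, 80
    X = rng.standard_normal((n, p))
    beta_true = np.zeros(p)
    beta_true[:5] = 1.0
    y = X @ beta_true + rng.standard_normal(n)
    X -= X.mean(axis=0)
    y -= y.mean()
    sigma2_hat, lam_hat, s_hat = lasso_cv_variance(X, y)
    print(f"sigma^2 estimate: {sigma2_hat:.3f}, lambda: {lam_hat:.4f}, nonzeros: {s_hat}")
```

The coordinate descent solver is written out only to keep the sketch free of dependencies beyond NumPy; in practice a tested Lasso implementation with built-in cross-validation would normally replace `lasso_cd` and `cv_lambda`.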

R. Tibshirani | J. Friedman | Stephen Reid