Exact Post-selection Inference for Forward Stepwise and Least Angle Regression

In this paper we propose new inference tools for forward stepwise and least angle regression. We first present a general scheme to perform valid inference after any selection event that can be characterized as the observation vector y falling into some polyhedral set. This framework then allows us to derive conditional (post-selection) hypothesis tests at any step of the forward stepwise and least angle regression procedures. We derive an exact null distribution for our proposed test statistics in finite samples, yielding p-values with exact type I error control. The tests can also be inverted to produce confidence intervals for appropriate underlying regression parameters. Application of this framework to general likelihood-based regression models (e.g., generalized linear models and the Cox model) is also discussed.
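As a rough illustration of conditioning on a polyhedral selection event, the sketch below computes a one-sided conditional p-value for a linear contrast v'y, given that y lies in the set {y : Gamma y >= u}. It is a minimal sketch, not the paper's implementation: it assumes independent Gaussian errors with a known standard deviation sigma, and it uses the standard truncated-Gaussian argument for polyhedral constraints. The names `truncation_limits` and `selective_p_value`, and the toy two-variable example, are hypothetical and chosen only for illustration.

```python
import math


def normal_sf(x):
    """Upper tail P(Z > x) of a standard normal, with explicit infinities."""
    if x == math.inf:
        return 0.0
    if x == -math.inf:
        return 1.0
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))


def truncation_limits(gamma, u, y, v):
    """Return (t, lo, up): the contrast t = v'y and the interval [lo, up]
    that the event {gamma y >= u} implies for t, holding fixed the part
    of y that is orthogonal to v (assumes covariance sigma^2 * I)."""
    vv = dot(v, v)
    t = dot(v, y)
    c = [vi / vv for vi in v]                   # direction along which t moves y
    z = [yi - ci * t for yi, ci in zip(y, c)]   # component of y orthogonal to v
    lo, up = -math.inf, math.inf
    for row, uj in zip(gamma, u):
        rho = dot(row, c)
        resid = uj - dot(row, z)
        if rho > 0:                             # constraint reads t >= resid / rho
            lo = max(lo, resid / rho)
        elif rho < 0:                           # constraint reads t <= resid / rho
            up = min(up, resid / rho)
        # rho == 0: the constraint does not involve t
    return t, lo, up


def selective_p_value(gamma, u, y, v, sigma=1.0):
    """One-sided p-value for H0: v'theta = 0 from a Gaussian truncated to [lo, up]."""
    t, lo, up = truncation_limits(gamma, u, y, v)
    s = sigma * math.sqrt(dot(v, v))
    num = normal_sf(t / s) - normal_sf(up / s)
    den = normal_sf(lo / s) - normal_sf(up / s)
    return num / den


# Toy example (hypothetical): two orthonormal predictors, so the first
# forward stepwise step picks variable 1 with positive sign exactly when
# y1 >= y2 and y1 >= -y2.
gamma = [[1.0, -1.0], [1.0, 1.0]]
u = [0.0, 0.0]
y = [3.0, 1.0]
v = [1.0, 0.0]  # contrast: coefficient of the selected variable

t, lo, up = truncation_limits(gamma, u, y, v)
p = selective_p_value(gamma, u, y, v, sigma=1.0)
print(t, lo, up, p)
```

In this toy case the lower truncation limit equals |y2|, so the conditional p-value is larger than the naive one-sided p-value `normal_sf(3.0)`, which ignores the fact that variable 1 was chosen because it had the largest absolute inner product with y.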
