Sequential selection procedures and false discovery rate control

We consider a multiple‐hypothesis testing setting where the hypotheses are ordered and one may reject only an initial contiguous block H_1, …, H_k of hypotheses. A rejection rule in this setting amounts to a procedure for choosing the stopping point k. This setting is inspired by the sequential nature of many model selection problems, where choosing a stopping point or a model is equivalent to rejecting all hypotheses up to that point and none thereafter. We propose two new testing procedures and prove that they control the false discovery rate in the ordered testing setting. We also show how the methods can be applied to model selection by using recent results on p‐values in sequential model selection settings.
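
To make the idea of a rejection rule as a choice of stopping point concrete, the following is a minimal sketch of one rule of this type. The abstract does not state either procedure, so the specific rule is an assumption: it is the rule usually referred to as ForwardStop, which rejects the first k hypotheses for the largest k whose running average of -log(1 - p_i) is at most the target level alpha. The function name `forward_stop` and its arguments are hypothetical, chosen for illustration only.

```python
import math


def forward_stop(p_values, alpha):
    """Return the stopping point k (0 means reject nothing).

    Assumed rule (ForwardStop): the largest k such that
    -(1/k) * sum_{i<=k} log(1 - p_i) <= alpha.
    p_values must be given in the pre-specified hypothesis order.
    """
    running_sum = 0.0
    k_hat = 0
    for k, p in enumerate(p_values, start=1):
        # -log(1 - p) is a standard exponential variable when p is uniform
        # on (0, 1); p == 1 contributes +infinity and blocks every later k.
        running_sum += math.inf if p >= 1.0 else -math.log1p(-p)
        if running_sum / k <= alpha:
            k_hat = k  # keep the largest k that satisfies the bound
    return k_hat


# Hypotheses H_1, ..., H_5 in order; reject H_1, ..., H_k for the returned k.
k = forward_stop([0.01, 0.02, 0.03, 0.5, 0.9], alpha=0.1)
```

Because the rule takes the largest qualifying k, a moderately large early p-value does not necessarily end the search: later small p-values can pull the running average back under alpha.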
