Regularisation Paths for Conditional Logistic Regression: the clogitL1 package

We apply the cyclic coordinate descent algorithm of Friedman, Hastie, and Tibshirani (2010) to the fitting of a conditional logistic regression model with lasso ($\ell_1$) and elastic net penalties. The sequential strong rules of Tibshirani, Bien, Hastie, Friedman, Taylor, Simon, and Tibshirani (2012) are also used in the algorithm, and we show that they offer a considerable speedup over standard coordinate descent with warm starts. We then use the algorithm in simulation studies to compare the variable selection and prediction performance of the conditional logistic regression model against that of its unconditional (standard) counterpart. The conditional model performs well on datasets drawn from a suitable conditional distribution, outperforming its unconditional counterpart at variable selection. We also fit the conditional model to a small real-world dataset, demonstrating how to obtain regularization paths for the model parameters and how to apply cross-validation for this method, where natural unconditional prediction rules are hard to come by.
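To make the two ingredients named in the abstract concrete, the following is a minimal sketch of lasso-penalised coordinate descent with sequential strong rule screening. It is not the clogitL1 implementation and rests on several assumptions: it covers only the special case of 1:1 matched pairs (where the conditional likelihood of a pair reduces to a logistic function of the case-minus-control covariate difference), it uses the lasso penalty only, and it replaces the quadratic approximation used in glmnet-style solvers with a fixed curvature bound of 1/4. The names `soft_threshold`, `score`, `strong_rule_keep`, and `fit_lasso` are hypothetical.

```python
import math

def soft_threshold(z, gamma):
    """S(z, gamma) = sign(z) * max(|z| - gamma, 0)."""
    if z > gamma:
        return z - gamma
    if z < -gamma:
        return z + gamma
    return 0.0

def score(diffs, beta):
    """Gradient of the average conditional log-likelihood for 1:1 matched pairs.

    diffs[i] is (case covariates - control covariates) for pair i, and the
    pair's likelihood contribution is sigmoid(diffs[i] . beta).
    """
    n, p = len(diffs), len(beta)
    grad = [0.0] * p
    for d in diffs:
        eta = sum(dj * bj for dj, bj in zip(d, beta))
        w = 1.0 / (1.0 + math.exp(eta))  # equals 1 - sigmoid(eta)
        for j in range(p):
            grad[j] += w * d[j] / n
    return grad

def strong_rule_keep(grad_prev, lam_new, lam_prev):
    """Sequential strong rule: keep predictor j only if
    |grad_j at the previous solution| >= 2 * lam_new - lam_prev."""
    thresh = 2.0 * lam_new - lam_prev
    return [j for j, g in enumerate(grad_prev) if abs(g) >= thresh]

def fit_lasso(diffs, lam, beta=None, active=None, sweeps=200):
    """Cyclic coordinate descent over the `active` predictors.

    Assumption: each coordinate step minimises a quadratic upper bound whose
    curvature is mean(d_j^2) / 4, which gives a soft-thresholding update.
    """
    n, p = len(diffs), len(diffs[0])
    beta = [0.0] * p if beta is None else list(beta)  # warm start if given
    active = list(range(p)) if active is None else list(active)
    curv = [sum(d[j] ** 2 for d in diffs) / (4.0 * n) for j in range(p)]
    for _ in range(sweeps):
        for j in active:
            if curv[j] == 0.0:
                continue  # predictor never differs within a pair
            # Recomputing the full gradient is wasteful but keeps the sketch short.
            g = score(diffs, beta)[j]
            beta[j] = soft_threshold(beta[j] + g / curv[j], lam / curv[j])
    return beta

# Toy data: five matched pairs, two predictors (case minus control).
diffs = [[1.0, 0.1], [2.0, -0.1], [1.5, 0.2], [-0.5, -0.2], [1.0, 0.0]]
grad0 = score(diffs, [0.0, 0.0])
keep = strong_rule_keep(grad0, lam_new=0.4, lam_prev=0.6)
beta = fit_lasso(diffs, lam=0.1, active=keep)
```

In a full path algorithm the screened fit would be followed by a check of the optimality (KKT) conditions on the discarded predictors, since the strong rule can occasionally discard a predictor that belongs in the model; any violators are added back and the fit is repeated. That check is omitted here for brevity.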

[1]  R. Tibshirani, et al. Strong rules for discarding predictors in lasso‐type problems, 2010, Journal of the Royal Statistical Society. Series B, Statistical methodology.

[2]  M. Gail, et al. Likelihood calculations for matched case-control studies and survival studies with tied death times, 1981.

[3]  L. V. van't Veer, et al. Cross‐validated Cox regression on microarray gene expression data, 2006, Statistics in medicine.

[4]  K. Lange, et al. Coordinate descent algorithms for lasso penalized regression, 2008, 0803.3876.

[5]  Trevor Hastie, et al. A New Algorithm for Matched Case-Control Studies with Applications to Additive Models, 1988.

[6]  Trevor Hastie, et al. Regularization Paths for Cox's Proportional Hazards Model via Coordinate Descent, 2011, Journal of statistical software.

[7]  R Core Team, et al. R: A language and environment for statistical computing, 2014.

[8]  Yves Grandvalet, et al. Analysis of multiple exposures in the case‐crossover design via sparse conditional likelihood, 2012, Statistics in medicine.

[9]  Trevor Hastie, et al. Regularization Paths for Generalized Linear Models via Coordinate Descent, 2010, Journal of statistical software.

[10]  H. Zou, et al. Regularization and variable selection via the elastic net, 2005.

[11]  Martyn Plummer, et al. A package for statistical analysis in epidemiology, 2016.

[12]  J. Goeman. L1 Penalized Estimation in the Cox Proportional Hazards Model, 2009, Biometrical journal. Biometrische Zeitschrift.

[13]  N. Breslow, et al. The analysis of case-control studies, 1980.