A Simple Method for Detecting Interactions between a Treatment and a Large Number of Covariates

We consider a setting in which we have a treatment and a potentially large number of covariates for a set of observations, and wish to model their relationship with an outcome of interest. We propose a simple method for modeling interactions between the treatment and the covariates. The idea is to modify each covariate in a simple way, and then fit a standard model using the modified covariates and no main effects. We show that, coupled with an efficiency augmentation procedure, this method produces clinically meaningful estimators in a variety of settings. It can be useful for practicing personalized medicine: determining, from a large set of biomarkers, the subset of patients who can potentially benefit from a treatment. We apply the method to both simulated datasets and real trial data. The modified covariates idea can also be used for other purposes, for example, large-scale hypothesis testing to determine which of a set of covariates interact with a treatment variable. Supplementary materials for this article are available online.
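
The abstract does not spell out the modification, so the following is only a minimal sketch of the modified-covariates idea under stated assumptions: the treatment is coded as +1/-1 with equal randomization, each centered covariate is multiplied by t/2, and the "standard model" is ordinary least squares with no intercept and no main effects. The function names, the simulated data, and the coefficient values are hypothetical, and the efficiency augmentation step is omitted.

```python
import numpy as np

def modified_covariates(Z, t):
    """Return centered covariates multiplied by t/2 (t assumed coded +1/-1)."""
    Z = np.asarray(Z, dtype=float)
    t = np.asarray(t, dtype=float)
    Zc = Z - Z.mean(axis=0)          # center each covariate column
    return Zc * t[:, None] / 2.0     # one modified covariate per column

def fit_interactions(Z, t, y):
    """Least-squares fit of y on the modified covariates only (no main effects)."""
    W = modified_covariates(Z, t)
    gamma, *_ = np.linalg.lstsq(W, np.asarray(y, dtype=float), rcond=None)
    return gamma

# Simulated randomized trial (all values are illustrative assumptions).
rng = np.random.default_rng(0)
n, p = 2000, 5
Z = rng.normal(size=(n, p))
t = rng.choice([-1.0, 1.0], size=n)                  # 1:1 randomization
beta = np.array([1.0, -1.0, 0.5, 0.0, 0.0])          # main effects (never modeled)
gamma_true = np.array([2.0, 0.0, -1.0, 0.0, 0.0])    # treatment-covariate interactions
y = Z @ beta + (Z @ gamma_true) * t / 2.0 + rng.normal(size=n)

gamma_hat = fit_interactions(Z, t, y)
print(np.round(gamma_hat, 2))
```

In this sketch the main effects are left out of the fit entirely. Because the treatment is randomized independently of the covariates, they act as extra noise rather than bias for the interaction coefficients. With many covariates, the least-squares step could be swapped for a penalized fit such as the lasso on the same modified covariates.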
