Some methods for heterogeneous treatment effect estimation in high dimensions

When devising a course of treatment for a patient, doctors often have little quantitative evidence on which to base their decisions, beyond their medical education and published clinical trials. Stanford Health Care alone has millions of electronic medical records that are only recently being leveraged to inform better treatment recommendations. These data present a unique challenge because they are high-dimensional and observational. Our goal is to make personalized treatment recommendations for a new patient based on the outcomes of similar past patients. We propose and analyze three methods for estimating heterogeneous treatment effects using observational data. Our methods perform well in simulations using a wide variety of treatment effect functions, and we present results of applying the two most promising methods to data from the SPRINT Data Analysis Challenge, which come from a large randomized trial of a treatment for high blood pressure.
