Reluctant generalized additive modeling

Sparse generalized additive models (GAMs) are an extension of sparse generalized linear models that allow a model's prediction to vary non-linearly with an input variable. This enables the data analyst to build more accurate models, especially when the linearity assumption is known to be a poor approximation of reality. Motivated by reluctant interaction modeling (Yu et al. 2019), we propose a multi-stage algorithm, called $\textit{reluctant generalized additive modeling (RGAM)}$, that can fit sparse generalized additive models at scale. It is guided by the principle that, all else being equal, one should prefer a linear feature over a non-linear feature. Unlike existing methods for sparse GAMs, RGAM can be extended easily to binary, count and survival data. We demonstrate the method's effectiveness on real and simulated examples.
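The abstract does not spell out the stages of the algorithm. As a rough illustration of the "prefer linear over non-linear" principle, the sketch below assumes a three-stage procedure: a lasso fit on the original features, a per-feature non-linear fit to the residuals of that fit, and a second lasso on the linear and non-linear features together. Everything in it is an assumption made for illustration: the function names `standardize`, `lasso_cd` and `rgam_sketch` are hypothetical, a cubic polynomial stands in for whatever smoother the method actually uses, both lasso stages share one penalty `lam`, and only a Gaussian response is handled.

```python
import numpy as np


def standardize(X):
    """Center each column and scale it to unit variance (constant columns become zero)."""
    mu = X.mean(axis=0)
    sd = X.std(axis=0)
    sd[sd == 0] = 1.0
    return (X - mu) / sd


def lasso_cd(X, y, lam, n_iter=200):
    """Lasso by cyclic coordinate descent for (1/2n)||y - Xb||^2 + lam * ||b||_1."""
    n, p = X.shape
    beta = np.zeros(p)
    resid = y.astype(float).copy()
    col_sq = (X ** 2).sum(axis=0) / n
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0:
                continue
            # Correlation of column j with the partial residual (feature j removed).
            rho = X[:, j] @ resid / n + col_sq[j] * beta[j]
            new = np.sign(rho) * max(abs(rho) - lam, 0.0) / col_sq[j]
            resid -= X[:, j] * (new - beta[j])
            beta[j] = new
    return beta


def rgam_sketch(X, y, lam, degree=3):
    """Assumed three-stage fit; returns (linear coefs, non-linear coefs)."""
    n, p = X.shape
    Xs = standardize(X)
    yc = y - y.mean()

    # Stage 1 (assumed): linear features get the first chance to explain y.
    beta1 = lasso_cd(Xs, yc, lam)
    r = yc - Xs @ beta1

    # Stage 2 (assumed): one non-linear feature per input, fitted to the residual.
    # A cubic polynomial is a stand-in for a smoother.
    F = np.column_stack(
        [np.polyval(np.polyfit(Xs[:, j], r, degree), Xs[:, j]) for j in range(p)]
    )

    # Stage 3 (assumed): lasso on linear and non-linear features together.
    Z = np.hstack([Xs, standardize(F)])
    beta = lasso_cd(Z, yc, lam)
    return beta[:p], beta[p:]


# Synthetic example: feature 0 acts linearly, feature 1 quadratically.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = 2.0 * X[:, 0] + X[:, 1] ** 2 + 0.1 * rng.normal(size=200)
lin, nonlin = rgam_sketch(X, y, lam=0.1)
print("linear coefficients:    ", np.round(lin, 2))
print("non-linear coefficients:", np.round(nonlin, 2))
```

Because the non-linear features are built only from what the linear fit leaves unexplained, a feature with a purely linear effect tends to be represented by its linear term, which is the preference the abstract describes. Standardizing the non-linear features in stage 3 is a simplification chosen here, not something the abstract specifies.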

[1] Trevor Hastie, et al. Regularization Paths for Generalized Linear Models via Coordinate Descent, 2010, Journal of Statistical Software.

[2] Johannes Gehrke, et al. Sparse Partially Linear Additive Models, 2014, ArXiv.

[3] Jacob Bien, et al. Reluctant Interaction Modeling, 2019, arXiv:1907.08414.

[4] R. Tibshirani. Regression Shrinkage and Selection via the Lasso, 1996.

[5] M. Yuan, et al. Model selection and estimation in regression with grouped variables, 2006.

[6] J. Bien, et al. Hierarchical Sparse Modeling: A Choice of Two Group Lasso Formulations, 2015, arXiv:1512.01631.

[7] E. Lander, et al. Gene expression correlates of clinical prostate cancer behavior, 2002, Cancer Cell.

[8] Bradley Efron, et al. Large-scale inference, 2010.

[9] Daniela Witten, et al. Data‐adaptive additive modeling, 2018, Statistics in Medicine.

[10] Ashley Petersen, et al. Fused Lasso Additive Model, 2014, Journal of Computational and Graphical Statistics.

[11] B. Efron. How Biased is the Apparent Error Rate of a Prediction Rule, 1986.

[12] R. Tibshirani, et al. Additive models with trend filtering, 2017, The Annals of Statistics.

[13] S. Geer, et al. High-dimensional additive modeling, 2008, arXiv:0806.4115.

[14] R. Tibshirani, et al. Generalized additive models for medical research, 1986, Statistical Methods in Medical Research.

[15] Hao Helen Zhang, et al. Component selection and smoothing in multivariate nonparametric regression, 2006, arXiv:math/0702659.

[16] T. Hastie, et al. Generalized Additive Model Selection, 2015, arXiv:1506.03850.

[17] J. Lafferty, et al. Sparse additive models, 2007, arXiv:0711.4555.