Smooth ε-insensitive regression by loss symmetrization

We describe a framework for solving regression problems by reduction to classification. Our reduction is based on symmetrization of margin-based loss functions commonly used in boosting algorithms, namely, the logistic loss and the exponential loss. Our construction yields a smooth version of the ε-insensitive hinge loss that is used in support vector regression. A byproduct of this construction is a new simple form of regularization for boosting-based classification and regression algorithms. We present two parametric families of batch learning algorithms for minimizing these losses. The first family employs a log-additive update and is based on recent boosting algorithms, while the second family uses a new form of additive update. We also describe and analyze online gradient descent (GD) and exponentiated gradient (EG) algorithms for the ε-insensitive logistic loss. Our regression framework also has implications for classification algorithms, namely, a new additive batch algorithm for the log-loss and exp-loss used in boosting.
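To illustrate the symmetrization idea, here is a minimal sketch (the notation δ = ŷ − y for the prediction discrepancy is our own, introduced for illustration): applying a margin-based loss to both the positively and negatively shifted discrepancy yields a smooth ε-insensitive regression loss,

  L_log(δ; ε) = log(1 + exp(δ − ε)) + log(1 + exp(−δ − ε)),

and the analogous symmetrization of the exponential loss gives

  L_exp(δ; ε) = exp(δ − ε) + exp(−δ − ε).

For |δ| well inside the ε-tube both terms are close to zero, while outside the tube the logistic variant grows roughly linearly in |δ|, mirroring the ε-insensitive hinge loss of support vector regression but with an everywhere-differentiable surrogate.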
