L1 regularized projection pursuit for additive model learning

In this paper, we present an L1-regularized projection pursuit algorithm for additive model learning. Two new algorithms are developed, for regression and classification respectively: sparse projection pursuit regression and sparse Jensen-Shannon Boosting. Because L1-regularized projection pursuit encourages sparse solutions, the new algorithms are robust to overfitting and generalize better, especially in settings with many irrelevant input features and noisy data. To make optimization with L1 regularization more efficient, we develop an "informative feature first" sequential optimization algorithm. Extensive experiments demonstrate the effectiveness of the proposed approach.
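To illustrate why an L1 penalty yields sparse solutions when many input features are irrelevant, the following is a minimal sketch of a generic L1-penalized least-squares fit of a single linear projection direction, solved by proximal gradient descent (ISTA). It is not the paper's sparse projection pursuit regression, sparse Jensen-Shannon Boosting, or "informative feature first" procedure; the function names, the synthetic data, and the value of `lam` are all assumptions made for illustration.

```python
import numpy as np

def soft_threshold(v, t):
    """Proximal operator of t * ||v||_1: shrink each entry toward zero by t."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def l1_projection_direction(X, y, lam, n_iter=500):
    """Minimize (1/2n) * ||y - X w||^2 + lam * ||w||_1 with ISTA.

    Hypothetical helper: a plain lasso-style fit of one linear direction w,
    used here only to show the sparsifying effect of the L1 penalty.
    """
    n, d = X.shape
    # Step size = 1 / Lipschitz constant of the gradient of the smooth term.
    step = n / (np.linalg.norm(X, 2) ** 2)
    w = np.zeros(d)
    for _ in range(n_iter):
        grad = X.T @ (X @ w - y) / n                      # gradient of the squared loss
        w = soft_threshold(w - step * grad, step * lam)   # L1 proximal step
    return w

# Synthetic data (assumed): 50 features, only the first 3 carry signal.
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 50))
w_true = np.zeros(50)
w_true[:3] = [2.0, -1.5, 1.0]
y = X @ w_true + 0.1 * rng.standard_normal(200)

w_hat = l1_projection_direction(X, y, lam=0.1)
n_nonzero = int(np.count_nonzero(w_hat))
print("non-zero weights:", n_nonzero, "of", w_hat.size)
```

The soft-thresholding step sets small weights exactly to zero, which is the mechanism by which an L1 penalty discards irrelevant features; an L2 penalty would only shrink them.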
