An ℓ2/ℓ1 regularization framework for diverse learning tasks

Regularization plays an important role in learning tasks: it incorporates prior knowledge about a problem and thereby improves learning performance. Well-known regularization methods, including ℓ2 and ℓ1 regularization, have shown great success in a variety of conventional learning tasks, and new types of regularization have also been developed for modern problems such as multi-task learning. In this paper, we introduce ℓ2/ℓ1 regularization for diverse learning tasks. The ℓ2/ℓ1 regularizer is a mixed norm defined over the parameters of the diverse learning tasks. It adaptively encourages diversity of features among the tasks, i.e., when a feature is responsible for some tasks it is unlikely to be responsible for the remaining tasks. We consider two applications of the ℓ2/ℓ1 regularization framework: learning a sparse self-representation of a dataset for clustering, and learning one-vs.-rest binary classifiers for multi-class classification. Both applications confirm the effectiveness of the new regularization framework on benchmark datasets.

Highlights
- We present a new ℓ2/ℓ1 regularization framework for learning diverse tasks.
- We derive an efficient first-order algorithm for solving the problem.
- Applications of the new regularization framework are provided.
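To make the mixed norm concrete, the sketch below computes one common reading of an ℓ2/ℓ1 norm on a features-by-tasks parameter matrix: an inner ℓ1 norm across tasks within each feature, followed by an outer ℓ2 norm across features. This is an illustrative assumption about the norm's composition, not the paper's exact formulation; the function name and the features-by-tasks layout are ours.

```python
import numpy as np

def l2_l1_norm(W):
    """One plausible ℓ2/ℓ1 mixed norm of a parameter matrix W
    (rows = features, columns = tasks): an ℓ1 norm across tasks
    within each feature, then an ℓ2 norm across features.
    The inner ℓ1 makes tasks compete for each feature, so a feature
    shared by many tasks is penalized more than one used exclusively.
    """
    row_l1 = np.abs(W).sum(axis=1)       # ℓ1 over tasks, per feature
    return np.sqrt((row_l1 ** 2).sum())  # ℓ2 over features

# A feature used exclusively by one task costs less than a feature
# shared across tasks, even at equal total parameter mass.
exclusive = np.array([[1.0, 0.0],
                      [0.0, 1.0]])   # each feature serves one task
shared = np.array([[1.0, 1.0],
                   [0.0, 0.0]])      # one feature serves both tasks

print(l2_l1_norm(exclusive))  # sqrt(2) ≈ 1.414
print(l2_l1_norm(shared))     # 2.0
```

With this composition, the penalty is smallest when each feature is responsible for as few tasks as possible, matching the exclusivity behavior described in the abstract.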
