Large-Scale Multiclass Support Vector Machine Training via Euclidean Projection onto the Simplex

Dual decomposition methods are the current state-of-the-art for training multiclass formulations of Support Vector Machines (SVMs). At every iteration, dual decomposition methods update a small subset of dual variables by solving a restricted optimization problem. In this paper, we propose an exact and efficient method for solving the restricted problem. In our method, the restricted problem is reduced to the well-known problem of Euclidean projection onto the positive simplex, which we can solve exactly in expected O(k) time, where k is the number of classes. We demonstrate that our method empirically achieves state-of-the-art convergence on several large-scale high-dimensional datasets.
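To make the projection step concrete, here is a minimal sketch of Euclidean projection onto the simplex. It is not the paper's implementation: it uses the simpler sort-based variant, which runs in O(k log k) rather than the expected O(k) mentioned in the abstract. The function name `project_simplex` and the parameter `z` (the value the projected vector should sum to) are assumptions made for illustration.

```python
def project_simplex(v, z=1.0):
    """Project v onto {w : w_i >= 0, sum(w) = z} in the Euclidean sense.

    Sort-based variant (O(k log k)); an illustrative sketch, not the
    expected-linear-time method referred to in the abstract.
    """
    if z <= 0:
        raise ValueError("z must be positive")
    if len(v) == 0:
        raise ValueError("v must be non-empty")

    # Sort the entries in decreasing order.
    u = sorted(v, reverse=True)

    # Find the threshold theta from the largest index j (1-based) such that
    # u_j - (sum(u_1..u_j) - z) / j > 0. The condition always holds at j = 1.
    cumsum = 0.0
    theta = 0.0
    for j, u_j in enumerate(u, start=1):
        cumsum += u_j
        candidate = (cumsum - z) / j
        if u_j - candidate > 0:
            theta = candidate

    # Shift every entry by theta and clip negatives to zero.
    return [max(x - theta, 0.0) for x in v]
```

For example, `project_simplex([2.0, 0.0])` returns `[1.0, 0.0]`: the first coordinate is shifted down until the entries sum to one, and the second is clipped at zero. An expected O(k) version would replace the sort with a randomized pivot-based search for the same threshold.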

Naonori Ueda | Akinori Fujino | Mathieu Blondel
