Heuristics for Improving the Performance of Online SVM Training Algorithms

Recently, the sequential minimal optimization (SMO) algorithm was introduced [1, 2] as an effective method for training support vector machines (SVMs) on classification tasks defined over sparse data sets. SMO differs from most SVM training algorithms in that it does not require a quadratic programming (QP) solver. One problem with SMO is that its rate of convergence slows down dramatically when the data are non-sparse and when the solution contains many support vectors. This work addresses the issue by showing how kernel function outputs can be cached effectively. Caching in SMO is non-trivial because SMO tends to access training points in an essentially random order. The proposed modifications can improve convergence time by orders of magnitude.
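
The abstract does not spell out the caching scheme. As an illustration only, the Python sketch below shows one plausible design: a least-recently-used (LRU) cache over rows of the kernel matrix, which suits SMO's near-random access pattern because recently touched points (typically the current support vectors) keep their rows resident. The names KernelRowCache, rbf_kernel, and max_rows are hypothetical, not the paper's API.

    import numpy as np
    from collections import OrderedDict

    class KernelRowCache:
        """LRU cache for rows of the kernel matrix K[i, :].

        SMO revisits training points in a near-random order, so a
        least-recently-used policy tends to keep the rows of the
        currently active points (the working support vectors) resident.
        """

        def __init__(self, X, kernel, max_rows=256):
            self.X = X                  # training data, shape (n, d)
            self.kernel = kernel        # kernel(X, x) -> row of shape (n,)
            self.max_rows = max_rows    # cache capacity in kernel rows
            self._rows = OrderedDict()  # index -> cached kernel row

        def row(self, i):
            if i in self._rows:
                self._rows.move_to_end(i)       # mark row i as recently used
                return self._rows[i]
            if len(self._rows) >= self.max_rows:
                self._rows.popitem(last=False)  # evict least-recently-used row
            k = self.kernel(self.X, self.X[i])  # compute the full kernel row
            self._rows[i] = k
            return k

    def rbf_kernel(X, x, gamma=0.5):
        """Gaussian RBF kernel row: K[j] = exp(-gamma * ||X[j] - x||^2)."""
        d2 = np.sum((X - x) ** 2, axis=1)
        return np.exp(-gamma * d2)

    # Usage: inside the SMO inner loop, fetch kernel rows through the cache.
    rng = np.random.default_rng(0)
    X = rng.standard_normal((1000, 10))
    cache = KernelRowCache(X, rbf_kernel, max_rows=128)
    k_i = cache.row(3)   # computed and stored
    k_i = cache.row(3)   # served from the cache

An LRU policy is only one choice; how much any policy helps depends on how strongly SMO's accesses concentrate on the eventual support vectors.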

[1] F. Girosi et al., "An improved training algorithm for support vector machines," in Neural Networks for Signal Processing VII: Proceedings of the 1997 IEEE Signal Processing Society Workshop, 1997.

[2] J. C. Platt, "Fast training of support vector machines using sequential minimal optimization," in Advances in Kernel Methods: Support Vector Learning, MIT Press, 1999.

[3] V. N. Vapnik, The Nature of Statistical Learning Theory, Statistics for Engineering and Information Science, 2000.

[4] N. S. Barnett et al., private communication, 1969.