Efficient Learning of Label Ranking by Soft Projections onto Polyhedra

We discuss the problem of learning to rank labels from real-valued feedback associated with each label. We cast the feedback as a preference graph whose nodes are the labels and whose edges express preferences over labels. We tackle the learning problem by defining a loss function for comparing a predicted graph with a feedback graph. This loss is materialized by decomposing the feedback graph into bipartite subgraphs. We then adopt the maximum-margin framework, which leads to a quadratic optimization problem with linear constraints. While the size of this problem grows quadratically with the number of nodes in the feedback graph, we derive a problem of significantly smaller size and prove that it attains the same minimum. We then describe an efficient algorithm, called SOPOPO, for solving the reduced problem by employing a soft projection onto the polyhedron defined by a reduced set of constraints. We also describe and analyze a wrapper procedure for batch learning when multiple graphs are provided for training. We conclude with a set of experiments showing significant improvements in run time over a state-of-the-art interior-point algorithm.
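To give a flavor of the projection step described above: the soft projection at the heart of SOPOPO solves a small quadratic program over a polyhedron via a sort-then-threshold procedure. As a simpler, illustrative instance (not the paper's reduced problem), the sketch below computes the classical Euclidean projection of a vector onto the probability simplex, which has the same overall structure: sort, find a breakpoint, then shift and clip.

```python
import numpy as np

def project_to_simplex(v, z=1.0):
    """Euclidean projection of v onto the polyhedron {w : w >= 0, sum(w) = z}.

    Illustrative sketch only; SOPOPO's actual polyhedron is defined by the
    reduced constraint set of its bipartite-subgraph QP, not the simplex.
    """
    u = np.sort(v)[::-1]                 # sort entries in decreasing order
    css = np.cumsum(u)                   # running sums of the sorted entries
    k = np.arange(1, len(v) + 1)
    # Largest index whose entry survives the shift (the "breakpoint")
    rho = np.nonzero(u * k > css - z)[0][-1]
    theta = (css[rho] - z) / (rho + 1.0) # optimal shift
    return np.maximum(v - theta, 0.0)    # shift and clip to the polyhedron
```

For example, `project_to_simplex(np.array([2.0, 0.0]))` returns `[1.0, 0.0]`. The cost is dominated by the sort, O(k log k) in the dimension, which is the kind of per-projection efficiency that motivates solving the reduced problem directly rather than invoking a generic QP solver.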
