Learning Permutations with Exponential Weights

We give an algorithm for learning a permutation on-line. The algorithm maintains its uncertainty about the target permutation as a doubly stochastic matrix. This matrix is updated by multiplying the current matrix entries by exponential factors. These factors destroy the doubly stochastic property of the matrix, so an iterative procedure is needed to re-normalize the rows and columns. Even though the result of the normalization procedure does not have a closed form, we can still bound the additional loss of our algorithm over the loss of the best permutation chosen in hindsight.
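
The update described above can be illustrated with a minimal sketch. It is not the paper's exact procedure: it assumes a per-entry loss matrix with values in [0, 1], a learning rate `eta`, and a fixed number of alternating row and column normalization sweeps (Sinkhorn-style balancing). The names `sinkhorn_balance`, `exponential_update`, `eta`, and `sweeps` are hypothetical and chosen for illustration only.

```python
import math


def sinkhorn_balance(matrix, sweeps=200):
    """Alternately rescale rows and columns so each sums to about 1.

    Assumes all entries are strictly positive. The sweep count is an
    arbitrary illustrative choice, not a value taken from the paper.
    """
    n = len(matrix)
    m = [row[:] for row in matrix]
    for _ in range(sweeps):
        # Rescale every row to sum to 1.
        for i in range(n):
            s = sum(m[i])
            m[i] = [v / s for v in m[i]]
        # Rescale every column to sum to 1.
        for j in range(n):
            s = sum(m[i][j] for i in range(n))
            for i in range(n):
                m[i][j] /= s
    return m


def exponential_update(weights, loss, eta=0.5):
    """Multiply each entry by exp(-eta * loss), then re-balance.

    The multiplication breaks the doubly stochastic property, so the
    result is passed through the iterative normalization above.
    """
    n = len(weights)
    scaled = [
        [weights[i][j] * math.exp(-eta * loss[i][j]) for j in range(n)]
        for i in range(n)
    ]
    return sinkhorn_balance(scaled)


n = 3
# Start from the uniform doubly stochastic matrix (maximal uncertainty).
weights = [[1.0 / n] * n for _ in range(n)]
# Hypothetical losses: only the identity permutation incurs zero loss.
loss = [[0.0 if i == j else 1.0 for j in range(n)] for i in range(n)]
for _ in range(20):
    weights = exponential_update(weights, loss)

for row in weights:
    print(" ".join(f"{v:.4f}" for v in row))
```

In this toy run the weight concentrates on the diagonal, which corresponds to the identity permutation, while rows and columns keep summing to 1 up to floating-point error.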
