Fast maximum margin matrix factorization for collaborative prediction

Maximum Margin Matrix Factorization (MMMF) was recently suggested (Srebro et al., 2005) as a convex, infinite-dimensional alternative to low-rank approximations and standard factor models. MMMF can be formulated as a semi-definite program (SDP) and learned using standard SDP solvers. However, current SDP solvers can only handle MMMF problems on matrices of dimensionality up to a few hundred. Here, we investigate a direct gradient-based optimization method for MMMF and demonstrate it on large collaborative prediction problems. We compare against results obtained by Marlin (2004) and find that MMMF substantially outperforms all nine methods he tested.
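To make the idea of direct gradient-based optimization concrete, the following is a minimal sketch rather than the authors' implementation. It assumes binary ratings in {-1, +1}, a smoothed hinge loss, and a factorized objective in which the Frobenius norms of the factors stand in for the trace norm. Plain gradient descent with step halving is swapped in here for whatever optimizer the paper actually uses. All names (`smooth_hinge`, `objective`, `fit`) and default parameter values are hypothetical.

```python
import numpy as np


def smooth_hinge(z):
    """Smoothed hinge: 0.5 - z for z <= 0, 0.5*(1 - z)^2 for 0 < z < 1, else 0."""
    return np.where(z <= 0, 0.5 - z, np.where(z < 1, 0.5 * (1 - z) ** 2, 0.0))


def smooth_hinge_grad(z):
    """Derivative of the smoothed hinge with respect to z."""
    return np.where(z <= 0, -1.0, np.where(z < 1, z - 1.0, 0.0))


def objective(U, V, Y, mask, C):
    """0.5 * (||U||_F^2 + ||V||_F^2) + C * (sum of losses over observed entries)."""
    Z = Y * (U @ V.T)
    reg = 0.5 * (np.sum(U ** 2) + np.sum(V ** 2))
    return reg + C * np.sum(mask * smooth_hinge(Z))


def fit(Y, mask, rank=5, C=1.0, step=0.1, iters=200, seed=0):
    """Gradient descent on the factors U, V; mask is 1 where Y is observed.

    Assumption: the rank bound, step size and iteration count are illustrative.
    """
    rng = np.random.default_rng(seed)
    n, m = Y.shape
    U = 0.1 * rng.standard_normal((n, rank))
    V = 0.1 * rng.standard_normal((m, rank))
    f = objective(U, V, Y, mask, C)
    history = [f]
    for _ in range(iters):
        # Gradient of the loss term with respect to X = U V^T, observed entries only.
        G = C * mask * Y * smooth_hinge_grad(Y * (U @ V.T))
        grad_U = U + G @ V
        grad_V = V + G.T @ U
        # Halve the step until the objective strictly decreases.
        while step > 1e-12:
            U_new = U - step * grad_U
            V_new = V - step * grad_V
            f_new = objective(U_new, V_new, Y, mask, C)
            if f_new < f:
                U, V, f = U_new, V_new, f_new
                history.append(f)
                break
            step *= 0.5
    return U, V, history
```

Predictions for unobserved entries would then be read off as `np.sign(U @ V.T)`. Handling more than two rating levels would need additional thresholds, which this sketch omits.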

[1]  J. Shewchuk.  An Introduction to the Conjugate Gradient Method Without the Agonizing Pain, 1994.

[2]  Michael J. Pazzani, et al.  Learning Collaborative Information Filters, 1998, ICML.

[3]  H. Sebastian Seung, et al.  Learning the parts of objects by non-negative matrix factorization, 1999, Nature.

[4]  Stephen J. Wright, et al.  Numerical Optimization, 2018, Fundamental Statistical Inference.

[5]  Sanjoy Dasgupta, et al.  A Generalization of Principal Components Analysis to the Exponential Family, 2001, NIPS.

[6]  Anna R. Karlin, et al.  Spectral analysis of data, 2001, STOC '01.

[7]  Stephen P. Boyd, et al.  A rank minimization heuristic with application to minimum order system approximation, 2001, Proceedings of the 2001 American Control Conference.

[8]  Tommi S. Jaakkola, et al.  Weighted Low-Rank Approximations, 2003, ICML.

[9]  Thomas Hofmann, et al.  Latent semantic models for collaborative filtering, 2004, TOIS.

[10]  Benjamin M. Marlin, et al.  Collaborative Filtering: A Machine Learning Perspective, 2004.

[11]  John F. Canny, et al.  GaP: a factor model for discrete data, 2004, SIGIR '04.

[12]  Richard S. Zemel, et al.  The multiple multiplicative factor model for collaborative filtering, 2004, ICML.

[13]  Tommi S. Jaakkola, et al.  Maximum-Margin Matrix Factorization, 2004, NIPS.

[14]  Tong Zhang, et al.  Text Categorization Based on Regularized Linear Classification Methods, 2001, Information Retrieval.

[15]  Jason D. M. Rennie, et al.  Loss Functions for Preference Levels: Regression with Discrete Ordered Labels, 2005.

[16]  Adi Shraibman, et al.  Rank, Trace-Norm and Max-Norm, 2005, COLT.