Sample Complexity of Learning Mahalanobis Distance Metrics

Metric learning seeks a transformation of the feature space that enhances prediction quality for a given task. In this work we provide PAC-style sample complexity rates for supervised metric learning. We give matching lower and upper bounds showing that, when no assumptions are made about the underlying data distribution, the sample complexity scales with the representation dimension. However, by leveraging the structure of the data distribution, we show that one can achieve rates that are fine-tuned to a specific notion of the intrinsic complexity of a given dataset. Our analysis reveals that augmenting the metric learning optimization criterion with a simple norm-based regularization can help adapt to a dataset's intrinsic complexity, yielding better generalization. Experiments on benchmark datasets validate our analysis and show that regularizing the metric helps discern the signal even when the data contains large amounts of noise.
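To make the regularized criterion concrete, here is a minimal sketch of one plausible formulation; the pair set P, the pairwise loss ℓ, and the trade-off parameter λ are illustrative assumptions rather than the paper's exact objective:

    \min_{M \succeq 0} \; \frac{1}{|P|} \sum_{(x_i, x_j, y_{ij}) \in P} \ell\big( y_{ij},\; (x_i - x_j)^\top M (x_i - x_j) \big) \;+\; \lambda \, \| M \|_F^2

Here M is a positive semidefinite matrix inducing the Mahalanobis distance d_M(x_i, x_j) = \sqrt{(x_i - x_j)^\top M (x_i - x_j)}, y_{ij} labels a pair of points as similar or dissimilar, and the Frobenius-norm penalty \lambda \| M \|_F^2 is the kind of simple norm-based regularization whose complexity-adaptive effect the analysis describes.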
