Paper information - Online and batch learning of pseudo-metrics

Online and batch learning of pseudo-metrics

We describe and analyze an online algorithm for supervised learning of pseudo-metrics. The algorithm receives pairs of instances and predicts their similarity according to a pseudo-metric. The pseudo-metrics we use are quadratic forms parameterized by positive semi-definite matrices. The core of the algorithm is an update rule based on successive projections onto the positive semi-definite cone and onto half-space constraints imposed by the examples. We describe an efficient procedure for performing these projections, derive a worst-case mistake bound on the similarity predictions, and discuss a dual version of the algorithm in which it is simple to incorporate kernel operators. The online algorithm also serves as a building block for deriving a large-margin batch algorithm. We demonstrate the merits of the proposed approach with experiments on the MNIST dataset and on document filtering.
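The two projections named in the abstract can be illustrated with a minimal NumPy sketch. This is not the paper's update rule or its efficient projection procedure: it is a generic illustration that assumes Frobenius-norm projections, a full eigendecomposition for the positive semi-definite step, and a half-space of the form {M : <M, V> <= c}. The function names `squared_distance`, `project_psd`, and `project_halfspace` and the threshold `c` are hypothetical.

```python
import numpy as np


def squared_distance(A, x, z):
    """Quadratic form (x - z)^T A (x - z) for a symmetric matrix A."""
    v = x - z
    return float(v @ A @ v)


def project_psd(A):
    """Frobenius-norm projection of a symmetric matrix onto the PSD cone.

    Assumption: a plain eigendecomposition with negative eigenvalues
    clipped to zero, not the efficient procedure the abstract refers to.
    """
    A = (A + A.T) / 2.0  # symmetrize to guard against round-off
    eigvals, eigvecs = np.linalg.eigh(A)
    return (eigvecs * np.clip(eigvals, 0.0, None)) @ eigvecs.T


def project_halfspace(A, V, c):
    """Frobenius-norm projection of A onto {M : <M, V> <= c}."""
    violation = np.sum(A * V) - c
    if violation <= 0.0:
        return A  # constraint already satisfied
    return A - (violation / np.sum(V * V)) * V


# Hypothetical example: ask that a pair be at squared distance at most 0.5.
x = np.array([1.0, 0.0])
z = np.zeros(2)
V = np.outer(x - z, x - z)  # <A, V> equals the squared distance of the pair
A = project_psd(project_halfspace(np.eye(2), V, 0.5))
```

With `V` set to the outer product of the difference vector, the inner product <A, V> equals the squared distance of the pair, so a constraint on a pair's distance becomes a half-space constraint on the matrix. Applying the two projections in turn, as above, mirrors the "successive projections" idea only in spirit.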
