Kernel Alignment Inspired Linear Discriminant Analysis

Kernel alignment measures the degree of similarity between two kernels. In this paper, inspired from kernel alignment, we propose a new Linear Discriminant Analysis (LDA) formulation, kernel alignment LDA (kaLDA). We first define two kernels, data kernel and class indicator kernel. The problem is to find a subspace to maximize the alignment between subspace-transformed data kernel and class indicator kernel. Surprisingly, the kernel alignment induced kaLDA objective function is very similar to classical LDA and can be expressed using between-class and total scatter matrices. This can be extended to multi-label data. We use a Stiefel-manifold gradient descent algorithm to solve this problem. We perform experiments on 8 single-label and 6 multi-label data sets. Results show that kaLDA has very good performance on many single-label and multi-label problems.

[1]  Edward Y. Chang,et al.  Learning the unified kernel machines for classification , 2006, KDD '06.

[2]  Chris H. Q. Ding,et al.  Spectral Relaxation for K-means Clustering , 2001, NIPS.

[3]  Neil D. Lawrence,et al.  Advances in Neural Information Processing Systems 14 , 2002 .

[4]  Jason Weston,et al.  A kernel method for multi-labelled classification , 2001, NIPS.

[5]  Jieping Ye,et al.  Discriminant Analysis for Dimensionality Reduction: An Overview of Recent Developments , 2010 .

[6]  Chris H. Q. Ding,et al.  K-means clustering via principal component analysis , 2004, ICML.

[7]  Yong Jae Lee,et al.  Foreground Focus: Unsupervised Learning from Partially Matching Images , 2009, International Journal of Computer Vision.

[8]  Jieping Ye,et al.  Extracting shared subspace for multi-label classification , 2008, KDD.

[9]  Yoshua Bengio,et al.  Gradient-based learning applied to document recognition , 1998, Proc. IEEE.

[10]  Trevor Hastie,et al.  Regularized linear discriminant analysis and its application in microarrays. , 2007, Biostatistics.

[11]  Grigorios Tsoumakas,et al.  Multi-Label Classification of Music into Emotions , 2008, ISMIR.

[12]  Feiping Nie,et al.  Trace Ratio Problem Revisited , 2009, IEEE Transactions on Neural Networks.

[13]  Chris H. Q. Ding,et al.  A Semi-definite Positive Linear Discriminant Analysis and Its Applications , 2012, 2012 IEEE 12th International Conference on Data Mining.

[14]  Volker Tresp,et al.  Multi-label informed latent semantic indexing , 2005, SIGIR '05.

[15]  Tony Jebara,et al.  Transformation Learning Via Kernel Alignment , 2009, 2009 International Conference on Machine Learning and Applications.

[16]  J. B. Rosen,et al.  Lower Dimensional Representation of Text Data Based on Centroids and Least Squares , 2003 .

[17]  Tao Jiang,et al.  Efficient and robust feature extraction by maximum margin criterion , 2003, IEEE Transactions on Neural Networks.

[18]  Zhi-Hua Zhou,et al.  Multilabel dimensionality reduction via dependence maximization , 2008, TKDD.

[19]  Terence Sim,et al.  The CMU Pose, Illumination, and Expression (PIE) database , 2002, Proceedings of Fifth IEEE International Conference on Automatic Face Gesture Recognition.

[20]  Alan Edelman,et al.  The Geometry of Algorithms with Orthogonality Constraints , 1998, SIAM J. Matrix Anal. Appl..

[21]  Jieping Ye,et al.  Characterization of a Family of Algorithms for Generalized Discriminant Analysis on Undersampled Problems , 2005, J. Mach. Learn. Res..

[22]  N. Cristianini,et al.  On Kernel-Target Alignment , 2001, NIPS.

[23]  Marco Cuturi,et al.  Fast Global Alignment Kernels , 2011, ICML.

[24]  Pietro Perona,et al.  Learning Generative Visual Models from Few Training Examples: An Incremental Bayesian Approach Tested on 101 Object Categories , 2004, 2004 Conference on Computer Vision and Pattern Recognition Workshop.

[25]  Jiebo Luo,et al.  Learning multi-label scene classification , 2004, Pattern Recognit..

[26]  Dong Xu,et al.  Trace Ratio vs. Ratio Trace for Dimensionality Reduction , 2007, 2007 IEEE Conference on Computer Vision and Pattern Recognition.

[27]  Chris H. Q. Ding,et al.  Multi-label Linear Discriminant Analysis , 2010, ECCV.

[28]  Zoubin Ghahramani,et al.  Nonparametric Transforms of Graph Kernels for Semi-Supervised Learning , 2004, NIPS.