Pairwise-Covariance Linear Discriminant Analysis

In machine learning, linear discriminant analysis (LDA) is a popular dimensionality reduction method. In this paper, we first provide an information-theoretic perspective on LDA. From this new perspective, we propose a new formulation of LDA that uses the pairwise averaged class covariance instead of the globally averaged class covariance used in standard LDA. This pairwise (averaged) covariance describes the data distribution more accurately. The new perspective also provides a natural way to weigh the pairwise distances, emphasizing pairs of classes that lie close together; this leads to the proposed pairwise-covariance properly weighted LDA (pcLDA). A kernel version of pcLDA is presented to handle nonlinear projections, together with efficient algorithms for computing the proposed models.
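To make the idea concrete, the following is a minimal sketch of a pairwise-covariance discriminant, assuming one plausible reading of the abstract: each class pair (i, j) contributes its mean difference and the pairwise averaged covariance S_ij = (S_i + S_j)/2, weighted by the inverse pairwise Mahalanobis distance so that close (hard-to-separate) class pairs dominate. The function name `pc_lda` and the specific 1/distance weighting are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np
from scipy.linalg import eigh

def pc_lda(X, y, dim):
    """Illustrative sketch of pairwise-covariance LDA (pcLDA).

    Assumption: the projection maximizes a weighted sum of pairwise
    between-class scatters normalized by pairwise averaged covariances,
    solved here as one generalized eigenproblem. The exact objective and
    weighting in the paper may differ.
    """
    classes = np.unique(y)
    d = X.shape[1]
    means = {c: X[y == c].mean(axis=0) for c in classes}
    covs = {c: np.cov(X[y == c], rowvar=False) for c in classes}

    Sb = np.zeros((d, d))  # weighted pairwise between-class scatter
    Sw = np.zeros((d, d))  # weighted pairwise within-class scatter
    for a, i in enumerate(classes):
        for j in classes[a + 1:]:
            diff = means[i] - means[j]
            S_ij = 0.5 * (covs[i] + covs[j])           # pairwise averaged covariance
            dist = diff @ np.linalg.solve(S_ij, diff)  # pairwise Mahalanobis distance
            w = 1.0 / dist                             # emphasize close class pairs (assumed weighting)
            Sb += w * np.outer(diff, diff)
            Sw += w * S_ij

    # Generalized eigenproblem Sb v = lambda Sw v; keep the top-`dim` directions.
    evals, evecs = eigh(Sb, Sw)
    return evecs[:, ::-1][:, :dim]

# Toy usage: three Gaussian classes in 2-D, projected to one dimension.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(m, 1.0, size=(50, 2)) for m in ([0, 0], [4, 0], [0, 8])])
y = np.repeat([0, 1, 2], 50)
G = pc_lda(X, y, dim=1)  # projection matrix of shape (2, 1)
```

Because the weight grows as two class means approach each other, the learned directions concentrate on separating the classes that a globally averaged covariance would tend to blur together.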
