A Flexible and Efficient Algorithm for Regularized Fisher Discriminant Analysis

Fisher linear discriminant analysis (LDA) and its kernel extension, kernel discriminant analysis (KDA), are well-known methods that treat dimensionality reduction and classification jointly. Although widely deployed in practical problems, they still present unresolved issues concerning efficient implementation and their relationship with least mean squared error procedures. In this paper we address these issues within the framework of regularized estimation. Our approach leads to a flexible and efficient implementation of LDA as well as KDA. We also uncover a general relationship between regularized discriminant analysis and ridge regression. This relationship yields variations on conventional LDA based on the pseudoinverse and a direct equivalence to an ordinary least squares estimator. Experimental results on a collection of benchmark data sets demonstrate the effectiveness of our approach.
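The classical binary-class instance of the LDA/ridge-regression connection mentioned above can be sketched numerically: the regularized Fisher direction (S_w + λI)^{-1}(μ₁ − μ₂) is collinear with the ridge solution obtained by regressing centered class-indicator targets on centered inputs. The sketch below is illustrative, assuming Gaussian toy data and an arbitrary regularization value λ; it is not the paper's general multi-class algorithm.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy two-class Gaussian data (illustrative values, not from the paper)
n1, n2, p = 60, 40, 5
X1 = rng.normal(0.0, 1.0, (n1, p)) + 1.0
X2 = rng.normal(0.0, 1.0, (n2, p)) - 1.0
X = np.vstack([X1, X2])
n = n1 + n2
lam = 0.5  # illustrative ridge / regularization parameter

# Regularized Fisher direction: (S_w + lam*I)^{-1} (mu1 - mu2)
mu1, mu2 = X1.mean(axis=0), X2.mean(axis=0)
Sw = (X1 - mu1).T @ (X1 - mu1) + (X2 - mu2).T @ (X2 - mu2)
w_lda = np.linalg.solve(Sw + lam * np.eye(p), mu1 - mu2)

# Ridge regression of centered class indicators on centered inputs:
# targets n2/n for class 1 and -n1/n for class 2 sum to zero, and
# Xc'Xc differs from S_w only by a rank-one term along (mu1 - mu2),
# so by the Sherman-Morrison formula the two solutions are collinear.
Xc = X - X.mean(axis=0)
y = np.concatenate([np.full(n1, n2 / n), np.full(n2, -n1 / n)])
w_ridge = np.linalg.solve(Xc.T @ Xc + lam * np.eye(p), Xc.T @ y)

# Cosine similarity between the two directions is 1 up to rounding
cos = w_lda @ w_ridge / (np.linalg.norm(w_lda) * np.linalg.norm(w_ridge))
print(round(cos, 6))
```

The two vectors differ only by a positive scalar, so as discriminant directions they are interchangeable; this is the binary special case of the regression-based view of discriminant analysis that the paper develops in general.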
