Trace Ratio Optimization With Feature Correlation Mining for Multiclass Discriminant Analysis

Fisher’s linear discriminant analysis is a widely accepted dimensionality reduction method that seeks a transformation matrix mapping the feature space to a lower-dimensional subspace by maximising the between-class scatter while minimising the within-class scatter. Finding the transformation matrix is fast and easy, which makes the method attractive, but its criterion overemphasises large class distances and is therefore suboptimal: class pairs that lie close together tend to overlap in the subspace. Although various weighting methods have been developed to overcome this problem, there is still room for improvement. In this work, we study a weighted trace ratio by maximising the harmonic mean of the multiple objective reciprocals. To further improve performance, we impose the ℓ2,1-norm on the developed objective function. Additionally, we propose an iterative algorithm to optimise this objective function. The proposed method avoids domination by the largest objective and guarantees that no objective becomes too small. It can be especially beneficial when the number of classes is large. Extensive experiments on different datasets show the effectiveness of the proposed method compared with four state-of-the-art methods.
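
The quantities named in the abstract can be illustrated with a minimal NumPy sketch. It is not the paper's algorithm: the function names, the unweighted trace ratio, and the plain harmonic mean are assumptions chosen for illustration, and the exact weighted objective and iterative solver are not given in this text. The sketch only shows how the between-class and within-class scatter matrices enter a trace ratio, how the ℓ2,1-norm of a transformation matrix is computed, and why a harmonic-mean aggregate is pulled towards its smallest term.

```python
import numpy as np


def scatter_matrices(X, y):
    """Return (S_b, S_w) for samples X of shape (n, d) and labels y of shape (n,)."""
    mean_all = X.mean(axis=0)
    d = X.shape[1]
    S_b = np.zeros((d, d))  # between-class scatter
    S_w = np.zeros((d, d))  # within-class scatter
    for c in np.unique(y):
        Xc = X[y == c]
        mean_c = Xc.mean(axis=0)
        diff = (mean_c - mean_all)[:, None]
        S_b += Xc.shape[0] * (diff @ diff.T)
        centered = Xc - mean_c
        S_w += centered.T @ centered
    return S_b, S_w


def trace_ratio(W, S_b, S_w):
    """Classical (unweighted) trace ratio criterion for a d x k matrix W."""
    return np.trace(W.T @ S_b @ W) / np.trace(W.T @ S_w @ W)


def l21_norm(W):
    """l2,1-norm: sum of the Euclidean norms of the rows of W."""
    return np.sqrt((W ** 2).sum(axis=1)).sum()


def harmonic_mean(values):
    """Harmonic mean of positive values; dominated by the smallest one."""
    values = np.asarray(values, dtype=float)
    return len(values) / np.sum(1.0 / values)
```

For example, `harmonic_mean([1.0, 100.0])` stays below 2 while the arithmetic mean of the same values is 50.5, which is the intuition behind preventing a single large objective from dominating. Rows of `W` that are entirely zero contribute nothing to `l21_norm(W)`, so penalising this norm encourages whole features to be dropped.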
