Group Component Analysis for Multiblock Data: Common and Individual Feature Extraction

Real-world data are often acquired as a collection of matrices rather than as a single matrix. Such multiblock data are naturally linked and typically share some common features while at the same time exhibiting their own individual features, reflecting the underlying data generation mechanisms. To exploit the linked nature of the data, we propose a new framework for common and individual feature extraction (CIFE), which identifies and separates the common and individual features in multiblock data. Two efficient algorithms, termed common orthogonal basis extraction (COBE), are proposed to extract the common basis shared by all data blocks, regardless of whether the number of common components is known beforehand. Feature extraction is then performed on the common and individual subspaces separately, by incorporating dimensionality reduction and blind source separation techniques. Comprehensive experimental results on both synthetic and real-world data demonstrate significant advantages of the proposed CIFE method over state-of-the-art methods.
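To make the idea of a common basis concrete, the following is a minimal sketch under stated assumptions, not the COBE algorithms themselves. It assumes noise-free blocks with the same number of rows and takes the common subspace to be the intersection of the blocks' column spaces, found as the eigenvectors of the summed orthogonal projectors whose eigenvalue equals the number of blocks. The function name `common_basis`, its tolerances, and the synthetic data are all hypothetical choices made for illustration.

```python
import numpy as np

def common_basis(blocks, rank_tol=1e-10, eig_tol=1e-8):
    """Orthonormal basis of the intersection of the blocks' column spaces.

    Illustrative substitute for COBE: assumes noise-free blocks that
    all have the same number of rows.
    """
    n_blocks = len(blocks)
    dim = blocks[0].shape[0]
    projector_sum = np.zeros((dim, dim))
    for Y in blocks:
        # Orthonormal basis of the column space of this block.
        U, s, _ = np.linalg.svd(Y, full_matrices=False)
        r = int(np.sum(s > rank_tol * s[0]))  # numerical rank
        Q = U[:, :r]
        projector_sum += Q @ Q.T
    # A vector lies in every column space exactly when the summed
    # projectors map it to n_blocks times itself.
    eigvals, eigvecs = np.linalg.eigh(projector_sum)
    return eigvecs[:, eigvals > n_blocks - eig_tol]

# Synthetic multiblock data (assumed for illustration): each block mixes
# 2 shared components with 3 block-specific components.
rng = np.random.default_rng(0)
dim, n_common, n_individual, n_cols = 50, 2, 3, 40
A_c = rng.standard_normal((dim, n_common))
blocks = []
for _ in range(3):
    A_i = rng.standard_normal((dim, n_individual))
    mixing = rng.standard_normal((n_common + n_individual, n_cols))
    blocks.append(np.hstack([A_c, A_i]) @ mixing)

basis = common_basis(blocks)
# Individual part of each block: what remains after removing the
# projection onto the common subspace.
individual_parts = [Y - basis @ (basis.T @ Y) for Y in blocks]
print(basis.shape)
```

With noisy data the eigenvalues no longer reach the number of blocks exactly, so a fixed tolerance of this kind would need to be replaced by a data-dependent threshold or a known number of common components.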
