Multilinear and nonlinear generalizations of partial least squares: an overview of recent advances

Partial least squares (PLS) is an efficient multivariate statistical regression technique that has been shown to be particularly useful for the analysis of highly collinear data. To predict response variables Y from independent variables X, PLS attempts to find a set of common orthogonal latent variables by projecting both X and Y onto a new subspace. With increasing interest in multi‐way analysis, extensions to multilinear regression models have also been developed with the aim of analyzing two multidimensional tensor datasets. In this article, we give an overview of PLS‐related methods, including linear, multilinear, and nonlinear variants, and discuss the strengths of the algorithms. Because canonical correlation analysis (CCA) is a similar technique that aims to extract the most correlated latent components between two datasets, we also briefly discuss the extension of CCA to tensor space. Finally, several examples are given to compare these methods on regression and classification tasks.
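
To make the latent-variable idea concrete, the following is a minimal sketch of single-response linear PLS (often called PLS1) in the NIPALS style, written with NumPy. It is an illustration under stated assumptions, not the specific algorithm of any method surveyed in the article: the function name `pls1_fit`, its arguments, and the synthetic data are all hypothetical, and only one response variable is handled for brevity.

```python
import numpy as np

def pls1_fit(X, y, n_components):
    """Hypothetical helper: fit single-response PLS by successive deflation."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xk, yk = X - x_mean, y - y_mean            # centered working copies
    n, p = X.shape
    W = np.zeros((p, n_components))            # weight vectors
    P = np.zeros((p, n_components))            # X loadings
    q = np.zeros(n_components)                 # y loadings
    T = np.zeros((n, n_components))            # latent scores
    for k in range(n_components):
        w = Xk.T @ yk                          # direction of maximal covariance with y
        w /= np.linalg.norm(w)
        t = Xk @ w                             # latent score for this component
        tt = t @ t
        p_k = Xk.T @ t / tt                    # regress X on the score
        q_k = yk @ t / tt                      # regress y on the score
        Xk = Xk - np.outer(t, p_k)             # deflate X
        yk = yk - q_k * t                      # deflate y
        W[:, k], P[:, k], q[k], T[:, k] = w, p_k, q_k, t
    B = W @ np.linalg.solve(P.T @ W, q)        # coefficients in the original X space
    intercept = y_mean - x_mean @ B
    return B, intercept, T

# Synthetic, highly collinear data (assumed for illustration):
# five observed variables driven by two hidden factors.
rng = np.random.default_rng(0)
Z = rng.normal(size=(50, 2))
X = Z @ rng.normal(size=(2, 5))
y = Z @ np.array([1.0, -2.0])

B, intercept, T = pls1_fit(X, y, n_components=2)
y_hat = X @ B + intercept
```

Because X here has only two underlying factors, two latent components suffice to reproduce y, and the columns of `T` are mutually orthogonal, which is the property that makes PLS well behaved on collinear data where ordinary least squares is ill-conditioned.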
