Bayesian Sparse Partial Least Squares

Partial least squares (PLS) is a class of methods that uses a set of latent (unobserved) variables to model the relation between two sets of variables, typically inputs and outputs. Several variants, differing in how the latent variables or components are computed, have been developed in recent years. In this letter, we propose a Bayesian formulation of PLS along with some extensions. In a nutshell, we provide sparsity at the input-space level and automatic estimation of the optimal number of latent components. We follow the variational approach to infer the parameter distributions. We have successfully tested the proposed methods on a synthetic data benchmark and on electrocorticogram data associated with several motor outputs in monkeys.
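
To make the latent-variable idea concrete, the following is a minimal sketch of classical (non-Bayesian, non-sparse) PLS regression using the NIPALS iteration. It is not the Bayesian formulation proposed in this letter; it only illustrates how latent components link inputs to outputs. The function name `pls_nipals` and its parameters are illustrative assumptions, and the number of components is fixed by hand here rather than estimated automatically.

```python
import numpy as np

def pls_nipals(X, Y, n_components, n_iter=500, tol=1e-10):
    """Classical PLS2 via NIPALS (illustrative sketch, not the Bayesian method).

    Returns B such that (Y - mean(Y)) is approximated by (X - mean(X)) @ B.
    """
    Xk = X - X.mean(axis=0)          # centered inputs, deflated in place of X
    Yk = Y - Y.mean(axis=0)          # centered outputs, deflated in place of Y
    p, q = Xk.shape[1], Yk.shape[1]
    W = np.zeros((p, n_components))  # input weights
    P = np.zeros((p, n_components))  # input loadings
    Q = np.zeros((q, n_components))  # output loadings

    for k in range(n_components):
        u = Yk[:, [0]]               # initial output score (assumes a nonzero column)
        for _ in range(n_iter):
            w = Xk.T @ u
            w /= np.linalg.norm(w)   # unit-norm input weight vector
            t = Xk @ w               # latent component (input score)
            c = Yk.T @ t / (t.T @ t) # regress outputs on the latent component
            u_new = Yk @ c / (c.T @ c)
            converged = np.linalg.norm(u_new - u) < tol
            u = u_new
            if converged:
                break
        p_k = Xk.T @ t / (t.T @ t)   # regress inputs on the latent component
        Xk = Xk - t @ p_k.T          # deflate: remove what this component explains
        Yk = Yk - t @ c.T
        W[:, [k]], P[:, [k]], Q[:, [k]] = w, p_k, c

    # Map weights on deflated inputs back to coefficients on the original inputs.
    return W @ np.linalg.solve(P.T @ W, Q.T)
```

With fewer components than input dimensions, the fit is restricted to a low-dimensional latent subspace; with as many components as (linearly independent) input dimensions, this classical variant coincides with ordinary least squares. Choosing that number by hand is the step the letter replaces with automatic estimation.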
