Gaussian Process Latent Variable Models for Visualisation of High Dimensional Data

In this paper we introduce a new underlying probabilistic model for principal component analysis (PCA). Our formulation interprets PCA as a particular Gaussian process prior on a mapping from a latent space to the observed data-space. We show that if the prior's covariance function constrains the mappings to be linear, the model is equivalent to PCA; we then extend the model by considering less restrictive covariance functions which allow non-linear mappings. This more general Gaussian process latent variable model (GPLVM) is then evaluated as an approach to the visualisation of high dimensional data for three different data-sets. Additionally, our non-linear algorithm can be further kernelised, leading to 'twin kernel PCA', in which a mapping between feature spaces occurs.
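To make the construction above concrete, the sketch below illustrates the basic GPLVM recipe in Python: initialise the latent coordinates with PCA, then adjust them to maximise the Gaussian process marginal likelihood of the data under a non-linear (RBF) covariance function. The function names, the fixed RBF hyperparameters, and the use of a generic numerical optimiser are illustrative assumptions, not the implementation used in the paper.

```python
# Minimal GPLVM sketch (illustrative assumptions, not the paper's code):
# optimise latent positions X to maximise the GP marginal likelihood of Y.
import numpy as np
from scipy.optimize import minimize

def rbf_kernel(X, lengthscale=1.0, variance=1.0, noise=1e-3):
    """RBF covariance between latent points, plus observation noise."""
    sq_dists = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    return variance * np.exp(-0.5 * sq_dists / lengthscale ** 2) + noise * np.eye(len(X))

def neg_log_likelihood(x_flat, Y, q):
    """Negative GP marginal log-likelihood of data Y given flattened latents."""
    N, D = Y.shape
    X = x_flat.reshape(N, q)
    K = rbf_kernel(X)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, Y))   # K^{-1} Y
    log_det = 2.0 * np.sum(np.log(np.diag(L)))            # log |K|
    return 0.5 * D * log_det + 0.5 * np.sum(Y * alpha)

def fit_gplvm(Y, q=2):
    """Initialise latents with PCA, then refine them by numerical optimisation."""
    N, _ = Y.shape
    Yc = Y - Y.mean(axis=0)
    # PCA initialisation: project onto the top-q principal directions.
    _, _, Vt = np.linalg.svd(Yc, full_matrices=False)
    X0 = Yc @ Vt[:q].T
    res = minimize(neg_log_likelihood, X0.ravel(), args=(Yc, q), method="L-BFGS-B")
    return res.x.reshape(N, q)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    Y = rng.standard_normal((30, 5))   # toy high-dimensional data
    X = fit_gplvm(Y, q=2)              # 2-D latent coordinates for visualisation
    print(X.shape)
```

Replacing the RBF covariance with a linear one recovers the principal subspace at the optimum, which is the sense in which the model generalises PCA.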
