Chained Gaussian Processes

Gaussian process models are flexible, Bayesian non-parametric approaches to regression. Properties of multivariate Gaussians mean that they can be combined linearly in the manner of additive models and via a link function (as in generalized linear models) to handle non-Gaussian data. However, the link function formalism is restrictive: link functions are always invertible and must convert a parameter of interest to a linear combination of the underlying processes. There are many likelihoods and models where a non-linear combination is more appropriate. We term these more general models Chained Gaussian Processes: the transformation of the GPs to the likelihood parameters will not generally be invertible, which implies that linearisation would only be possible with multiple (localized) links, i.e. a chain. We develop an approximate inference procedure for Chained GPs that is scalable and applicable to any factorized likelihood. We demonstrate the approximation on a range of likelihood functions.
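
To make the chained construction concrete, the sketch below samples two independent latent GPs and combines them non-linearly through the parameters of a heteroscedastic Gaussian likelihood, y_n ~ N(f_n, exp(g_n)^2). This is a minimal illustrative example of the generative model only, not the paper's inference procedure; the RBF kernel, lengthscales, and the identity/exponential transformations are assumptions chosen for illustration.

```python
import numpy as np

def rbf_kernel(X, lengthscale=1.0, variance=1.0, jitter=1e-8):
    """Squared-exponential covariance matrix for 1-D inputs X of shape (N, 1)."""
    sqdist = (X - X.T) ** 2
    K = variance * np.exp(-0.5 * sqdist / lengthscale ** 2)
    return K + jitter * np.eye(len(X))  # jitter for numerical stability

rng = np.random.default_rng(0)
X = np.linspace(0.0, 10.0, 200)[:, None]

# Two independent latent GPs: f drives the mean, g drives the log standard deviation.
f = rng.multivariate_normal(np.zeros(len(X)), rbf_kernel(X, lengthscale=2.0))
g = rng.multivariate_normal(np.zeros(len(X)), rbf_kernel(X, lengthscale=3.0, variance=0.25))

# Chained combination: each likelihood parameter is a (generally non-invertible)
# transformation of the latent processes, here y_n ~ N(f_n, exp(g_n)^2).
y = f + np.exp(g) * rng.standard_normal(len(X))
```

Note that the map from (f_n, g_n) to the observation y_n cannot be inverted through a single global link to one linear predictor, which is what motivates the chained treatment with a separate (localized) link per likelihood parameter.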
