Matrix-Variate Dirichlet Process Priors with Applications

In this paper we propose a matrix-variate Dirichlet process (MATDP) for modeling the joint prior of a set of random matrices. Our approach is able to share statistical strength among regression coefficient matrices due to the clustering property of the Dirichlet process. Moreover, since the base probability measure is defined as a matrix-variate distribution, the dependence among the elements of each random matrix is described via the matrix-variate distribution. We apply MATDP to multivariate supervised learning problems. In particular, we devise a nonparametric discriminative model and a nonparametric latent factor model. The interest is in considering correlations both across response variables (or covariates) and across response vectors. We derive MCMC algorithms for posterior inference and prediction, and illustrate the application of the models to multivariate regression, multi-class classification and multi-label prediction problems.
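
To make the clustering property concrete, the following is a minimal sketch of drawing random matrices from a Dirichlet process whose base measure is matrix-variate, using the standard Pólya urn scheme. It is an illustration under stated assumptions, not the paper's algorithm: the base measure is assumed here to be a matrix normal distribution, and the names `sample_matrix_normal`, `polya_urn_draws`, `M`, `U`, `V` and `alpha` are hypothetical.

```python
import numpy as np

def sample_matrix_normal(rng, M, U, V):
    """Draw one matrix from a matrix normal with mean M,
    row covariance U and column covariance V (assumed base measure)."""
    A = np.linalg.cholesky(U)          # U = A A^T
    B = np.linalg.cholesky(V)          # V = B B^T
    Z = rng.standard_normal(M.shape)   # i.i.d. standard normal entries
    return M + A @ Z @ B.T

def polya_urn_draws(rng, n, alpha, M, U, V):
    """Draw n matrices from DP(alpha, G0) via the Polya urn scheme,
    where G0 is the matrix normal above."""
    atoms, counts, labels = [], [], []
    for i in range(n):
        # Join existing cluster k with probability counts[k] / (i + alpha),
        # or open a new cluster with probability alpha / (i + alpha).
        probs = np.array(counts + [alpha], dtype=float) / (i + alpha)
        k = int(rng.choice(len(probs), p=probs))
        if k == len(atoms):
            atoms.append(sample_matrix_normal(rng, M, U, V))
            counts.append(1)
        else:
            counts[k] += 1
        labels.append(k)
    # Matrices with the same label are exactly equal (shared atom).
    return [atoms[k] for k in labels], labels

rng = np.random.default_rng(0)
draws, labels = polya_urn_draws(
    rng, n=50, alpha=1.0, M=np.zeros((2, 3)), U=np.eye(2), V=np.eye(3)
)
```

Draws that land in the same cluster share one matrix-valued atom, which is how such a prior would let several coefficient matrices pool information, while `U` and `V` encode dependence among the rows and columns within each matrix.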
