Bayesian Latent Semantic Analysis of Multimedia Databases

We present a Bayesian mixture model for probabilistic latent semantic analysis of documents containing images and text. The Bayesian perspective provides automatic regularisation, which yields sparser and more coherent clustering models, and lets us encode a priori knowledge such as word and image preferences. The learnt model can be used for browsing digital databases, information retrieval with image and/or text queries, image annotation (adding words to an image), and text illustration (adding images to a text).