Bayesian Models for Massive Multimedia Databases: a New Frontier

Modelling the growing number of digital databases (the web, photo libraries, music collections, news archives, medical databases) is one of the greatest challenges facing statisticians in the new century. Despite the large amounts of data available, the models are themselves so large that they motivate a Bayesian approach. In particular, the Bayesian perspective allows us to perform automatic regularisation to obtain sparse and coherent models. It also enables us to encode a priori knowledge, such as word, music and image preferences. The learned models can be used for browsing digital databases, information retrieval with image, music and/or text queries, image annotation (adding words to an image), text illustration (adding images to a text), and object recognition.
