Hierarchical Bayesian Models for Applications in Information Retrieval

SUMMARY We present a simple hierarchical Bayesian approach to modeling text collections and other large-scale data collections. For text collections, we posit that a document is generated by choosing a random set of multinomial probabilities for a set of possible “topics,” and then repeatedly generating words by sampling from the topic mixture. Exact probabilistic inference in this model is intractable, but approximate posterior probabilities and marginal likelihoods can be obtained via fast variational methods. We also present extensions to coupled models for joint text/image data and multiresolution models for topic hierarchies.
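The generative process described in the summary can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the summary does not name the distribution over topic probabilities, so a Dirichlet prior is assumed here, and the vocabulary, the per-topic word probabilities, the value of `alpha`, and the function names are all hypothetical toy choices.

```python
import random


def sample_dirichlet(alpha, rng):
    """Draw topic proportions by normalizing independent Gamma draws.

    Assumption: the "random set of multinomial probabilities" follows a
    Dirichlet distribution with parameter vector alpha.
    """
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]


def generate_document(alpha, topic_word_probs, vocab, n_words, rng):
    """Generate one document: topic proportions first, then words."""
    theta = sample_dirichlet(alpha, rng)
    topic_ids = list(range(len(alpha)))
    words = []
    for _ in range(n_words):
        # Pick a topic for this word according to the document's mixture.
        z = rng.choices(topic_ids, weights=theta)[0]
        # Pick a word from the chosen topic's word distribution.
        w = rng.choices(vocab, weights=topic_word_probs[z])[0]
        words.append(w)
    return theta, words


# Hypothetical toy setup: two topics over a four-word vocabulary.
vocab = ["query", "index", "pixel", "image"]
topic_word_probs = [
    [0.45, 0.45, 0.05, 0.05],  # a "retrieval"-like topic
    [0.05, 0.05, 0.45, 0.45],  # an "image"-like topic
]
alpha = [0.5, 0.5]

rng = random.Random(0)
theta, words = generate_document(alpha, topic_word_probs, vocab, 8, rng)
print(theta)
print(words)
```

The sketch covers only the forward (sampling) direction. Recovering the topic proportions from observed words is the inference problem that the summary describes as intractable to solve exactly and that it addresses with variational approximations.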
