Paper information - Hierarchical Bayesian Models for Applications in Information Retrieval

Hierarchical Bayesian Models for Applications in Information Retrieval

SUMMARY We present a simple hierarchical Bayesian approach to modeling text collections and other large-scale data collections. For text collections, we posit that a document is generated by first choosing a random set of multinomial probabilities over a set of possible “topics,” and then repeatedly generating words by sampling from the resulting topic mixture. Exact probabilistic inference in this model is intractable, but approximate posterior probabilities and marginal likelihoods can be obtained via fast variational methods. We also present extensions to coupled models for joint text/image data and multiresolution models for topic hierarchies.
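The generative process described in the summary can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that the per-document topic probabilities are drawn from a Dirichlet prior (the summary only says "a random set of multinomial probabilities") and that each topic has a fixed word distribution. The names `sample_dirichlet`, `generate_document`, `alpha`, and `topic_word_probs`, as well as the toy vocabulary, are hypothetical and chosen for illustration only.

```python
import random


def sample_dirichlet(alpha, rng):
    """Draw one probability vector from a Dirichlet(alpha) distribution.

    Uses the standard construction: independent Gamma(alpha_i, 1) draws,
    normalized to sum to one. The Dirichlet prior is an assumption here.
    """
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]


def generate_document(alpha, topic_word_probs, vocab, n_words, rng):
    """Generate one document under the assumed topic-mixture process."""
    # Step 1: choose this document's random topic probabilities.
    theta = sample_dirichlet(alpha, rng)
    topics = list(range(len(alpha)))
    words = []
    for _ in range(n_words):
        # Step 2: pick a topic for this word from the document's mixture.
        z = rng.choices(topics, weights=theta, k=1)[0]
        # Step 3: pick a word from that topic's word distribution.
        w = rng.choices(vocab, weights=topic_word_probs[z], k=1)[0]
        words.append(w)
    return theta, words


# Toy setup (hypothetical): two topics over a four-word vocabulary.
vocab = ["gene", "dna", "ball", "team"]
topic_word_probs = [
    [0.5, 0.5, 0.0, 0.0],  # topic 0 emits only "gene" and "dna"
    [0.0, 0.0, 0.5, 0.5],  # topic 1 emits only "ball" and "team"
]
rng = random.Random(0)
theta, words = generate_document([0.5, 0.5], topic_word_probs, vocab, 8, rng)
print(theta, words)
```

The sketch covers only forward sampling. Recovering the topic probabilities from observed words is the inference problem that the summary describes as intractable to solve exactly, and which it addresses with variational approximations.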

Michael I. Jordan | A. Ng | D. Blei
