Probabilistic inference in graphical models

1 INTRODUCTION

A "graphical model" is a type of probabilistic network that has roots in several different research communities. The graphical model framework provides a clean mathematical formalism that has made it possible to understand the relationships among a wide variety of network-based approaches to computation, and in particular to understand many neural network algorithms and architectures as instances of a broader probabilistic methodology.

Graphical models use graphs to represent and manipulate joint probability distributions. The graph underlying a graphical model may be directed, in which case the model is often referred to as a belief network or a Bayesian network (see BAYESIAN NETWORKS), or undirected, in which case the model is generally referred to as a Markov random field. A graphical model has both a structural component, encoded by the pattern of edges in the graph, and a parametric component, encoded by numerical "potentials" associated with sets of edges in the graph. The relationship between these two components underlies the computational machinery associated with graphical models. In particular, general inference algorithms allow statistical quantities (such as likelihoods and conditional probabilities) and information-theoretic quantities (such as mutual information and conditional entropies) to be computed efficiently. These inference algorithms are the subject of the current article. Learning algorithms build on them, allowing parameters and structures to be estimated from data (see GRAPHICAL MODELS, PARAMETER LEARNING and GRAPHICAL MODELS, STRUCTURE LEARNING).
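To make the two components and the notion of inference concrete, the following Python sketch builds a toy directed graphical model and computes a conditional probability by brute-force enumeration. The network, variable names, and probability values are illustrative assumptions, not taken from this article; practical inference algorithms obtain the same answer far more efficiently by exploiting the graph structure rather than enumerating all configurations.

```python
# A minimal sketch (assumed example): the classic Rain/Sprinkler/WetGrass
# Bayesian network, with edges Rain -> Sprinkler, Rain -> WetGrass,
# and Sprinkler -> WetGrass. All numbers below are made up for illustration.

import itertools

# Parametric component: one conditional probability table (CPT) per node,
# keyed by the values of that node's parents.
P_rain = {(): {0: 0.8, 1: 0.2}}                 # P(R)
P_sprinkler = {(0,): {0: 0.6, 1: 0.4},          # P(S | R)
               (1,): {0: 0.99, 1: 0.01}}
P_wet = {(0, 0): {0: 1.0, 1: 0.0},              # P(W | S, R)
         (0, 1): {0: 0.2, 1: 0.8},
         (1, 0): {0: 0.1, 1: 0.9},
         (1, 1): {0: 0.01, 1: 0.99}}

def joint(r, s, w):
    """Structural component: the joint factorizes along the directed edges."""
    return P_rain[()][r] * P_sprinkler[(r,)][s] * P_wet[(s, r)][w]

# Inference: P(Rain = 1 | WetGrass = 1), by summing the joint over the
# unobserved variable (Sprinkler) and normalizing.
num = sum(joint(1, s, 1) for s in (0, 1))
den = sum(joint(r, s, 1) for r, s in itertools.product((0, 1), repeat=2))
print("P(Rain=1 | WetGrass=1) =", num / den)   # approx. 0.358
```

Enumeration scales exponentially in the number of variables; the general inference algorithms discussed in this article (such as the sum-product algorithm) instead distribute the sums over the factorization so that the cost is governed by local graph structure.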
