Data Parallelism for Belief Propagation in Factor Graphs

We investigate data parallelism for belief propagation in acyclic factor graphs on multicore/manycore processors. Belief propagation is a key problem in exploring factor graphs, a class of probabilistic graphical models that has found applications in many domains. In this paper, we identify basic operations, called node-level primitives, for updating the distribution tables in a factor graph. We develop algorithms for these primitives that exploit data parallelism. We also propose a complete belief propagation algorithm that performs exact inference in such graphs. We implement the proposed algorithms on state-of-the-art multicore processors and, using a representative set of factor graphs, show that they scale well. On a 32-core Intel Nehalem-EX based system, we achieve a 30× speedup for the primitives and a 29× speedup for the complete algorithm using factor graphs with large distribution tables.
