Scalable Node-Level Computation Kernels for Parallel Exact Inference

In this paper, we investigate data parallelism in exact inference with respect to arbitrary junction trees. Exact inference is a key problem in exploring probabilistic graphical models, where the computational complexity increases dramatically with clique width and the number of states of the random variables. We study potential table representation and scalable algorithms for node-level primitives. Based on these node-level primitives, we propose computation kernels for evidence collection and evidence distribution. A data-parallel algorithm for exact inference is presented using the proposed computation kernels. We analyze the scalability of the node-level primitives, the computation kernels, and the exact inference algorithm using the coarse-grained multicomputer (CGM) model. According to the analysis, we achieve $O(N d_C w_C \prod_{j=1}^{w_C} r_{C,j} / P)$ local computation time and $O(N)$ global communication rounds using $P$ processors, $1 \le P \le \max_C \prod_{j=1}^{w_C} r_{C,j}$, where $N$ is the number of cliques in the junction tree; $d_C$ is the clique degree; $r_{C,j}$ is the number of states of the $j$th random variable in $C$; $w_C$ is the clique width; and $w_S$ is the separator width. We implemented the proposed algorithm on state-of-the-art clusters. Experimental results show that the proposed algorithm exhibits almost linear scalability over a wide range.
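To make the quantities in the stated bound concrete, the following is a minimal sketch that evaluates them for a toy junction tree. It is not the paper's algorithm: the function names, the list-based clique description, and the example numbers are all hypothetical, and the work estimate drops constant factors. Each clique is given as the list of state counts of its random variables, so its width is the length of that list and its potential table has one entry per joint state.

```python
from math import prod


def table_size(states):
    """Entries in a clique potential table: the product of the
    per-variable state counts (hypothetical helper)."""
    return prod(states)


def max_useful_processors(cliques):
    """Upper limit on P in the stated range: the largest potential
    table size over all cliques."""
    return max(table_size(states) for states in cliques)


def local_work_estimate(cliques, degrees, p):
    """Per-processor work with constants omitted: the sum over cliques
    of degree * width * table_size / P. This sum is at most N times
    the largest per-clique term, which is how this sketch reads the
    asymptotic bound (an assumption, not a statement from the paper)."""
    if not 1 <= p <= max_useful_processors(cliques):
        raise ValueError("P is outside the range covered by the analysis")
    return sum(
        d * len(states) * table_size(states) / p
        for states, d in zip(cliques, degrees)
    )


# Toy example (made-up numbers): three cliques with table sizes 8, 6, 24.
cliques = [[2, 2, 2], [2, 3], [3, 2, 2, 2]]
degrees = [1, 2, 1]
print(max_useful_processors(cliques))
print(local_work_estimate(cliques, degrees, 4))
```

Doubling the number of processors within the allowed range halves the estimate, which is the linear-scaling behavior the bound expresses; beyond the largest table size there are no further table entries to distribute.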
