Wait-Free Primitives for Initializing Bayesian Network Structure Learning on Multicore Processors

Structure learning is a key problem in applying Bayesian networks to data mining tasks, but its computational complexity increases dramatically with the number of features in the dataset. It is therefore computationally intractable to extend structure learning to large networks without a scalable parallel approach. This work explores computation primitives for parallelizing the first phase of Cheng et al.'s (Artificial Intelligence, 137(1-2):43-90, 2002) Bayesian network structure learning algorithm. The proposed primitives are highly suitable for multithreaded architectures. First, we propose a wait-free table construction primitive for building potential tables from the training data in parallel. Notably, this primitive allows multiple cores to update a potential table simultaneously without any lock operations, so all cores remain fully utilized. Second, we propose a marginalization primitive that enables efficient statistical tests over all pairs of variables in the learning algorithm. These primitives are evaluated quantitatively on a 32-core platform, and the experimental results show a 23.5× speedup over a single-threaded implementation.

[1] Barton P. Miller, et al. What are race conditions?: Some issues and formalizations, 1992, LOPLAS.

[2] Judea Pearl, et al. Probabilistic reasoning in intelligent systems - networks of plausible inference, 1991, Morgan Kaufmann series in representation and reasoning.

[3] David R. O'Hallaron, et al. Distributed Parallel Inference on Large Factor Graphs, 2009, UAI.

[4] Joseph Gonzalez, et al. Residual Splash for Optimally Parallelizing Belief Propagation, 2009, AISTATS.

[5] Laura E. Brown, et al. Scaling-Up Bayesian Network Learning to Thousands of Variables Using Local Learning Techniques, 2003.

[6] Vincent Frouin, et al. Evolutionary approaches for the reverse-engineering of gene regulatory networks: A study on a biologically realistic dataset, 2008, BMC Bioinformatics.

[7] C. N. Liu, et al. Approximating discrete probability distributions with dependence trees, 1968, IEEE Trans. Inf. Theory.

[8] Viktor K. Prasanna, et al. Junction tree decomposition for parallel exact inference, 2008, IEEE International Symposium on Parallel and Distributed Processing (IPDPS).

[9] David A. Bell, et al. Learning Bayesian networks from data: An information-theory based approach, 2002, Artif. Intell.

[10] Viktor K. Prasanna, et al. Scalable parallel implementation of exact inference in Bayesian networks, 2006, 12th International Conference on Parallel and Distributed Systems (ICPADS'06).

[11] Srinivas Aluru, et al. Parallel globally optimal structure learning of Bayesian networks, 2013, J. Parallel Distributed Comput.

[12] Zheng Rong Yang, et al. Machine Learning Approaches to Bioinformatics, 2010, Science, Engineering, and Biology Informatics.

[13] David Maxwell Chickering, et al. A Transformational Characterization of Equivalent Bayesian Network Structures, 1995, UAI.

[14] Andrew W. Moore, et al. Optimal Reinsertion: A New Search Operator for Accelerated and More Accurate Bayesian Network Structure Learning, 2003, ICML.

[15] Daphne Koller, et al. Ordering-Based Search: A Simple and Effective Algorithm for Learning Bayesian Networks, 2005, UAI.

[16] Satoru Miyano, et al. Finding Optimal Models for Small Gene Networks, 2003.

[17] David M. Pennock. Logarithmic Time Parallel Bayesian Inference, 1998, UAI.

[18] Viktor K. Prasanna, et al. Scalable Node-Level Computation Kernels for Parallel Exact Inference, 2010, IEEE Transactions on Computers.

[19] Franz von Kutschera, et al. Causation, 1993, J. Philos. Log.

[20] Satoru Miyano, et al. Parallel Algorithm for Learning Optimal Bayesian Network Structure, 2011, J. Mach. Learn. Res.

[21] Nir Friedman, et al. Learning Bayesian Network Structure from Massive Datasets: The "Sparse Candidate" Algorithm, 1999, UAI.

[22] Thomas M. Cover, et al. Elements of Information Theory, 2005.

[23] Nir Friedman, et al. Probabilistic Graphical Models - Principles and Techniques, 2009.

[24] David Maxwell Chickering, et al. Learning Bayesian Networks: The Combination of Knowledge and Statistical Data, 1994, Machine Learning.

[25] Viktor K. Prasanna, et al. Distributed Evidence Propagation in Junction Trees on Clusters, 2012, IEEE Transactions on Parallel and Distributed Systems.

[26] Gregory F. Cooper, et al. A Bayesian Method for the Induction of Probabilistic Networks from Data, 1992.

[27] Judea Pearl, et al. A Theory of Inferred Causation, 1991, KR.