Learning Bayesian network classifiers with completed partially directed acyclic graphs

Most search-and-score algorithms for learning Bayesian network classifiers from data traverse the space of directed acyclic graphs (DAGs), making arbitrary and possibly suboptimal decisions about arc direction. This can be remedied by learning in the space of DAG equivalence classes. We make several contributions to existing work along this line. First, we identify the smallest subspace of DAGs that covers all possible class-posterior distributions when data is complete. In every DAG in this space, which we call minimal class-focused DAGs (MC-DAGs), each arc is directed towards a child of the class variable. Second, in order to traverse the equivalence classes of MC-DAGs, we adapt greedy equivalence search (GES) by adding operator validity criteria which ensure that GES only visits states within our space. Third, we specify how to efficiently evaluate the discriminative score of a GES operator for MC-DAGs in time independent of the number of variables and without converting the completed partially directed acyclic graph (CPDAG), which represents an equivalence class, into a DAG. The adapted GES performed well on real-world data sets.
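
The arc property stated for MC-DAGs can be illustrated with a minimal sketch. It checks only the one condition given in the abstract, namely that every arc points to a child of the class variable, and should not be read as the full MC-DAG definition or as the authors' implementation. The arc-list representation, the function name `arcs_point_to_class_children`, and the variable names are assumptions made for illustration.

```python
from typing import Iterable, Tuple

Arc = Tuple[str, str]  # (parent, child); this representation is an assumption


def arcs_point_to_class_children(arcs: Iterable[Arc], class_var: str) -> bool:
    """Return True if every arc is directed towards a child of class_var.

    Hypothetical helper: it tests only the arc property stated in the
    abstract and does not check acyclicity or minimality.
    """
    arcs = list(arcs)
    # Children of the class variable are the heads of arcs leaving it.
    class_children = {child for parent, child in arcs if parent == class_var}
    # Every arc, including those leaving the class, must end in that set.
    return all(child in class_children for _, child in arcs)


# Naive Bayes structure: the class C is the only parent of each feature.
naive_bayes = [("C", "X1"), ("C", "X2")]
# Augmented structure: X1 -> X2 is allowed because X2 is a child of C.
augmented = naive_bayes + [("X1", "X2")]
# X3 is not a child of C, so the arc X1 -> X3 violates the property.
violating = naive_bayes + [("X1", "X3")]

print(arcs_point_to_class_children(naive_bayes, "C"))
print(arcs_point_to_class_children(augmented, "C"))
print(arcs_point_to_class_children(violating, "C"))
```

Under this check, an arc into the class variable itself also fails, since the class is never its own child.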
