Augmented Semi-naive Bayes Classifier

Naive Bayes is a competitive classifier that makes strong conditional independence assumptions: it treats the features as mutually independent given the class. Its accuracy can be improved by relaxing these assumptions. One classifier that does so is the semi-naive Bayes. The state-of-the-art algorithm for learning a semi-naive Bayes from data is backward sequential elimination and joining (BSEJ). We extend BSEJ with a second step that removes some of its unwarranted independence assumptions. Our classifier outperforms BSEJ and five other Bayesian network classifiers on a set of benchmark databases, although the difference in performance is not statistically significant.
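
To make the idea of relaxing the independence assumptions concrete, the following is a minimal sketch of the "joining" operation that a semi-naive Bayes relies on: two features are merged into one Cartesian-product feature, so their interaction given the class is modelled directly. It is an illustration under stated assumptions, not the BSEJ search or the second step proposed here. The names `join_features`, `fit_naive_bayes` and `predict`, the Laplace smoothing parameter `alpha`, and the XOR toy data are all hypothetical choices made for this example.

```python
from collections import Counter, defaultdict
from math import log


def join_features(rows, i, j):
    """Replace columns i and j by one Cartesian-product column (appended last)."""
    joined = []
    for row in rows:
        rest = [v for k, v in enumerate(row) if k not in (i, j)]
        joined.append(tuple(rest) + ((row[i], row[j]),))
    return joined


def fit_naive_bayes(rows, labels, alpha=1.0):
    """Count class and per-feature value frequencies for categorical data."""
    class_counts = Counter(labels)
    n_features = len(rows[0])
    # counts[f][c][v] = number of class-c rows whose feature f equals v
    counts = [defaultdict(Counter) for _ in range(n_features)]
    values = [set() for _ in range(n_features)]
    for row, c in zip(rows, labels):
        for f, v in enumerate(row):
            counts[f][c][v] += 1
            values[f].add(v)
    return class_counts, counts, values, alpha


def predict(model, row):
    """Return the class with the highest smoothed log posterior score."""
    class_counts, counts, values, alpha = model
    total = sum(class_counts.values())
    best, best_score = None, float("-inf")
    for c, n_c in class_counts.items():
        score = log(n_c / total)
        for f, v in enumerate(row):
            # Laplace-smoothed estimate of P(feature f = v | class c)
            score += log((counts[f][c][v] + alpha) / (n_c + alpha * len(values[f])))
        if score > best_score:
            best, best_score = c, score
    return best


# Toy data (assumed for illustration): the class is the XOR of two binary features,
# so each feature alone carries no information about the class.
rows = [(0, 0), (0, 1), (1, 0), (1, 1)] * 5
labels = [a ^ b for a, b in rows]

naive = fit_naive_bayes(rows, labels)
joined_rows = join_features(rows, 0, 1)
semi_naive = fit_naive_bayes(joined_rows, labels)
```

On this toy data the plain naive Bayes cannot separate the classes, while the model fitted on the joined feature can, which is the kind of gain that joining is meant to capture. How BSEJ decides which features to join or eliminate is not shown here.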
