Discriminative vs. Generative Learning of Bayesian Network Classifiers

Discriminative learning of Bayesian network classifiers has recently received considerable attention from the machine learning community, and several publications have proposed new methods for the discriminative learning of both structure and parameters. In this paper we present an empirical study that illustrates how discriminative learning performs relative to generative learning for simple Bayesian network classifiers such as naive Bayes and tree-augmented naive Bayes (TAN), and we discuss when and why discriminative learning is preferable. We also analyze how the log-likelihood and conditional log-likelihood scores guide the learning process of Bayesian network classifiers.
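To make the two scores concrete, the following is a minimal sketch, not the procedure used in the study. It assumes a naive Bayes model with binary features, Laplace-smoothed frequency estimates, and a made-up toy data set; the function names are hypothetical. The log-likelihood (LL) sums log P(c, x) over the training cases, the conditional log-likelihood (CLL) sums log P(c | x), and the two differ by the sum of log P(x).

```python
import math
from collections import Counter

def fit_naive_bayes(X, y, alpha=1.0):
    """Generative fit: Laplace-smoothed frequency estimates (binary features assumed)."""
    classes = sorted(set(y))
    n_features = len(X[0])
    counts = Counter(y)
    prior = {c: (counts[c] + alpha) / (len(y) + alpha * len(classes)) for c in classes}
    cond = {}  # cond[c][j] approximates P(x_j = 1 | c)
    for c in classes:
        rows = [x for x, label in zip(X, y) if label == c]
        cond[c] = [(sum(r[j] for r in rows) + alpha) / (len(rows) + 2 * alpha)
                   for j in range(n_features)]
    return prior, cond

def log_joint(x, c, prior, cond):
    """log P(c, x) under the naive Bayes factorization."""
    lp = math.log(prior[c])
    for j, v in enumerate(x):
        p = cond[c][j]
        lp += math.log(p if v == 1 else 1.0 - p)
    return lp

def scores(X, y, prior, cond):
    """Return (LL, CLL): sums of log P(c, x) and log P(c | x) over the data."""
    ll = cll = 0.0
    for x, c in zip(X, y):
        joints = {k: log_joint(x, k, prior, cond) for k in prior}
        m = max(joints.values())
        # log P(x) via log-sum-exp over the classes
        log_marginal = m + math.log(sum(math.exp(v - m) for v in joints.values()))
        ll += joints[c]
        cll += joints[c] - log_marginal
    return ll, cll

# Toy data, invented purely for illustration.
X = [[1, 0], [1, 1], [0, 1], [0, 0], [1, 1], [0, 1]]
y = [0, 0, 1, 1, 0, 1]
prior, cond = fit_naive_bayes(X, y)
ll, cll = scores(X, y, prior, cond)
print(f"LL = {ll:.3f}, CLL = {cll:.3f}")
```

Because log P(x) is never positive for discrete data, LL is always at most CLL. A generative learner maximizes LL and therefore also spends modelling effort on P(x), whereas a discriminative learner targets CLL, the part of the score that depends on the class posterior.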
