Generative or Discriminative? Getting the Best of Both Worlds

For many applications of machine learning the goal is to predict the value of a vector c given the value of a vector x of input features. In a classification problem c represents a discrete class label, whereas in a regression problem it corresponds to one or more continuous variables. From a probabilistic perspective, the goal is to find the conditional distribution p(c|x). The most common approach to this problem is to represent the conditional distribution using a parametric model, and then to determine the parameters using a training set consisting of pairs {x_n, c_n} of input vectors along with their corresponding target output vectors. The resulting conditional distribution can be used to make predictions of c for new values of x. This is known as a discriminative approach, since the conditional distribution discriminates directly between the different values of c.
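As a concrete illustration of the discriminative approach described above, the sketch below fits a parametric model of p(c|x) to a small training set and then predicts c for new values of x. The choice of logistic regression, the toy data, and the names `sigmoid`, `predict_proba`, and `fit` are all assumptions made for this example; the text does not prescribe a particular model or training procedure.

```python
import math


def sigmoid(a):
    # Numerically stable logistic function.
    if a >= 0:
        return 1.0 / (1.0 + math.exp(-a))
    e = math.exp(a)
    return e / (1.0 + e)


def predict_proba(w, b, x):
    # Parametric model of the conditional distribution: p(c=1 | x) = sigmoid(w . x + b).
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)


def fit(X, C, lr=0.1, steps=5000):
    # Determine the parameters by gradient ascent on the average
    # conditional log-likelihood (1/N) * sum_n log p(c_n | x_n).
    w = [0.0] * len(X[0])
    b = 0.0
    n = len(X)
    for _ in range(steps):
        grad_w = [0.0] * len(w)
        grad_b = 0.0
        for x, c in zip(X, C):
            err = c - predict_proba(w, b, x)
            for j, xj in enumerate(x):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj + lr * g / n for wj, g in zip(w, grad_w)]
        b += lr * grad_b / n
    return w, b


# Hypothetical training pairs {x_n, c_n}: one input feature, binary class label.
# The classes overlap, so the maximum-likelihood parameters are finite.
X = [[0.0], [1.0], [2.0], [3.0], [4.0], [5.0]]
C = [0, 0, 1, 0, 1, 1]

w, b = fit(X, C)
p_low = predict_proba(w, b, [0.5])   # p(c=1 | x) for a new, small x
p_high = predict_proba(w, b, [4.5])  # p(c=1 | x) for a new, large x
```

Note that the model never describes how x itself is distributed: only the conditional p(c|x) is parameterised and fitted, which is what makes the approach discriminative.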
