Principled Hybrids of Generative and Discriminative Models

When labelled training data is plentiful, discriminative techniques are widely used since they give excellent generalization performance. However, for large-scale applications such as object recognition, hand labelling of data is expensive, and there is much interest in semi-supervised techniques based on generative models in which the majority of the training data is unlabelled. Although the generalization performance of generative models can often be improved by ‘training them discriminatively’, they can then no longer make use of unlabelled data. In an attempt to gain the benefit of both generative and discriminative approaches, heuristic procedure have been proposed [2, 3] which interpolate between these two extremes by taking a convex combination of the generative and discriminative objective functions. In this paper we adopt a new perspective which says that there is only one correct way to train a given model, and that a ‘discriminatively trained’ generative model is fundamentally a new model [7]. From this viewpoint, generative and discriminative models correspond to specific choices for the prior over parameters. As well as giving a principled interpretation of ‘discriminative training’, this approach opens door to very general ways of interpolating between generative and discriminative extremes through alternative choices of prior. We illustrate this framework using both synthetic data and a practical example in the domain of multi-class object recognition. Our results show that, when the supply of labelled training data is limited, the optimum performance corresponds to a balance between the purely generative and the purely discriminative.

[1]  James M. Kasson,et al.  An analysis of selected computer interchange color spaces , 1992, TOGS.

[2]  Heekuck Oh,et al.  Neural Networks for Pattern Recognition , 1993, Adv. Comput..

[3]  Sadik Kapadia,et al.  Discriminative Training of Hidden Markov Models , 1998 .

[4]  Michael I. Jordan,et al.  On Discriminative vs. Generative Classifiers: A comparison of logistic regression and naive Bayes , 2001, NIPS.

[5]  Tony Jebara,et al.  Machine Learning: Discriminative and Generative (Kluwer International Series in Engineering and Computer Science) , 2003 .

[6]  Guillaume Bouchard,et al.  The Tradeoff Between Generative and Discriminative Classifiers , 2004 .

[7]  Antonio Criminisi,et al.  Object categorization by learned universal visual dictionary , 2005, Tenth IEEE International Conference on Computer Vision (ICCV'05) Volume 1.

[8]  Pietro Perona,et al.  A discriminative framework for modelling object classes , 2005, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05).

[9]  Ilkay Ulusoy,et al.  Generative versus discriminative methods for object recognition , 2005, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05).

[10]  Vasant Honavar,et al.  Discriminatively trained Markov model for sequence classification , 2005, Fifth IEEE International Conference on Data Mining (ICDM'05).

[11]  Andrew Zisserman,et al.  A Statistical Approach to Texture Classification from Single Images , 2005, International Journal of Computer Vision.

[12]  T. Minka Discriminative models, not discriminative training , 2005 .

[13]  Tony Jebara,et al.  Machine learning: Discriminative and generative , 2006 .