Asymptotic Efficiency of Deterministic Estimators for Discrete Energy-Based Models: Ratio Matching and Pseudolikelihood

Standard maximum likelihood estimation cannot be applied to discrete energy-based models in the general case because the computation of exact model probabilities is intractable. Recent research has seen the proposal of several new estimators designed specifically to overcome this intractability, but virtually nothing is known about their theoretical properties. In this paper, we present a generalized estimator that unifies many of the classical and recently proposed estimators. We use results from the standard asymptotic theory for M-estimators to derive a generic expression for the asymptotic covariance matrix of our generalized estimator. We apply these results to study the relative statistical efficiency of classical pseudolikelihood and the recently-proposed ratio matching estimator.

[1]  R. Fisher,et al.  On the Mathematical Foundations of Theoretical Statistics , 1922 .

[2]  E. Ising Beitrag zur Theorie des Ferromagnetismus , 1925 .

[3]  J. Besag Statistical Analysis of Non-Lattice Data , 1975 .

[4]  J. Laurie Snell,et al.  Markov Random Fields and Their Applications , 1980 .

[5]  Geoffrey E. Hinton,et al.  A Learning Algorithm for Boltzmann Machines , 1985, Cogn. Sci..

[6]  Paul Smolensky,et al.  Information processing in dynamical systems: foundations of harmony theory , 1986 .

[7]  Judea Pearl,et al.  Probabilistic reasoning in intelligent systems - networks of plausible inference , 1991, Morgan Kaufmann series in representation and reasoning.

[8]  L. Younes Parametric Inference for imperfectly observed Gibbsian fields , 1989 .

[9]  Andrew McCallum,et al.  Conditional Random Fields: Probabilistic Models for Segmenting and Labeling Sequence Data , 2001, ICML.

[10]  Martin J. Wainwright,et al.  Tree-reweighted belief propagation algorithms and approximate ML estimation by pseudo-moment matching , 2003, AISTATS.

[11]  Geoffrey E. Hinton,et al.  Exponential Family Harmoniums with an Application to Information Retrieval , 2004, NIPS.

[12]  Noah A. Smith,et al.  Contrastive Estimation: Training Log-Linear Models on Unlabeled Data , 2005, ACL.

[13]  Aapo Hyvärinen,et al.  Some extensions of score matching , 2007, Comput. Stat. Data Anal..

[14]  Michael I. Jordan,et al.  An asymptotic analysis of generative, discriminative, and pseudolikelihood estimators , 2008, ICML '08.

[15]  Siwei Lyu,et al.  Interpretation and Generalization of Score Matching , 2009, UAI.

[16]  Nando de Freitas,et al.  Inductive Principles for Restricted Boltzmann Machine Learning , 2010, AISTATS.

[17]  Daphne Koller,et al.  Non-Local Contrastive Objectives , 2010, ICML.

[18]  Harry Joe,et al.  Composite Likelihood Methods , 2012 .