The Infinite PCFG Using Hierarchical Dirichlet Processes

We present a nonparametric Bayesian model of tree structures based on the hierarchical Dirichlet process (HDP). Our HDP-PCFG model allows the complexity of the grammar to grow as more training data becomes available. In addition to presenting a fully Bayesian model for the PCFG, we develop an efficient variational inference procedure. On synthetic data, we recover the correct grammar without having to specify its complexity in advance. We also show that our techniques can be applied to full-scale parsing applications by demonstrating their effectiveness in learning state-split grammars.
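To give a concrete sense of how a Dirichlet process prior can leave the number of grammar symbols unspecified, the following is a minimal sketch of the standard truncated stick-breaking construction for the top-level symbol weights. It is an illustration under stated assumptions, not the model or inference procedure of this paper: the function name `stick_breaking_weights`, the concentration value `alpha`, and the truncation level are all hypothetical choices made for the example.

```python
import random


def stick_breaking_weights(alpha, truncation, rng):
    """Draw truncated stick-breaking (GEM) weights.

    Each weight is beta_k = v_k * prod_{j<k} (1 - v_j) with v_k ~ Beta(1, alpha).
    The final weight takes all remaining mass so the weights sum to one.
    The truncation level is an assumption made for illustration.
    """
    weights = []
    remaining = 1.0  # length of the stick not yet assigned to any symbol
    for _ in range(truncation - 1):
        v = rng.betavariate(1.0, alpha)  # fraction of the remaining stick to break off
        weights.append(v * remaining)
        remaining *= 1.0 - v
    weights.append(remaining)  # last symbol absorbs the leftover mass
    return weights


rng = random.Random(0)  # fixed seed so the draw is reproducible
beta = stick_breaking_weights(alpha=1.0, truncation=10, rng=rng)
```

Smaller values of `alpha` concentrate mass on the first few symbols, while larger values spread it over more symbols; this is one way a prior can express a preference for compact grammars without fixing their size.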
