Fast State Discovery for HMM Model Selection and Learning

Choosing the number of hidden states and their topology (model selection) and estimating model parameters (learning) are important problems for Hidden Markov Models. This paper presents a new state-splitting algorithm that addresses both of these problems. The algorithm models more information about the dynamic context of a state during a split, enabling it to discover underlying states more effectively. Compared to previous top-down methods, the algorithm also touches a smaller fraction of the data per split, leading to faster model search and selection. Because of its efficiency and ability to avoid local minima, the state-splitting approach is a good way to learn HMMs even if the desired number of states is known beforehand. We compare our approach to previous work on synthetic data as well as on several real-world data sets from the literature, revealing significant improvements in efficiency and test-set likelihoods. We also compare to previous algorithms on a sign-language recognition task, with positive results.
