Compressing Parameters in Bayesian High-order Models with Application to Logistic Sequence Models

Bayesian classification and regression with high-order interactions is largely infeasible because Markov chain Monte Carlo (MCMC) would need to be applied with a great many parameters, whose number increases rapidly with the order. In this paper we show how to make it feasible by effectively reducing the number of parameters, exploiting the fact that many interactions have the same values for all training cases. Our method uses a single "compressed" parameter to represent the sum of all parameters associated with a set of patterns that have the same value for all training cases. Using symmetric stable distributions as the priors of the original parameters, we can easily find the priors of these compressed parameters. We therefore need to deal only with a much smaller number of compressed parameters when training the model with MCMC. After training, we can split the compressed parameters back into the original ones as needed to make predictions for test cases. We show in detail how to compress parameters for logistic sequence prediction models. Experiments on both simulated and real data demonstrate that our compression method can indeed reduce the number of parameters enormously.
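The grouping step described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function name `compress_parameters` and the toy indicator matrix are assumptions. It uses the closure property of symmetric stable distributions — the sum of k i.i.d. symmetric stable variables with index alpha and scale s is symmetric stable with scale k^(1/alpha) * s (for Cauchy priors, alpha = 1, so the scale is simply k * s) — which is what makes the prior of each compressed parameter available in closed form.

```python
import numpy as np

def compress_parameters(indicator_matrix, base_scale=1.0, alpha=1.0):
    """Group interaction patterns that take identical values on all
    training cases, so one compressed parameter can stand for the sum
    of the group's original parameters.

    indicator_matrix: shape (n_patterns, n_cases); row p holds the
    value of pattern p on every training case.
    Returns a list of (pattern_indices, compressed_prior_scale) pairs,
    where the scale follows the stable-closure rule k**(1/alpha) * s.
    """
    groups = {}
    for p, row in enumerate(indicator_matrix):
        # Patterns with the same value vector over the training cases
        # are indistinguishable to the likelihood, so they share a group.
        groups.setdefault(tuple(row), []).append(p)
    return [(idx, len(idx) ** (1.0 / alpha) * base_scale)
            for idx in groups.values()]

# Toy example: 4 patterns, 3 training cases. Patterns 0 and 2 agree on
# every case, so they collapse into one compressed parameter whose
# Cauchy prior scale doubles (alpha = 1, k = 2).
X = np.array([[1, 0, 1],
              [0, 1, 1],
              [1, 0, 1],
              [1, 1, 0]])
compressed = compress_parameters(X, base_scale=1.0, alpha=1.0)
```

After MCMC has sampled the three compressed parameters, each group's sum would be split back into its original members (by the relevant conditional distribution) only for the test cases that distinguish them.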
