Abstract. Bayesian classification and regression with high-order interactions is largely infeasible because Markov chain Monte Carlo (MCMC) would have to handle a very large number of parameters, and that number increases rapidly with the order. In this paper we show how to make it feasible by effectively reducing the number of parameters, exploiting the fact that many interactions have the same values for all training cases. Our method uses a single “compressed” parameter to represent the sum of all parameters associated with a set of patterns that have the same value for all training cases. Because we use symmetric stable distributions as the priors of the original parameters, the priors of these compressed parameters are easy to find. We therefore need to deal only with a much smaller number of compressed parameters when training the model with MCMC. The number of compressed parameters may converge before the highest possible order is considered. After training the model, we can split these compressed parameters into the original ones as needed to make predictions for test cases. We show in detail how to compress parameters for logistic sequence prediction models. Experiments on both simulated and real data demonstrate that our compression method can indeed greatly reduce the number of parameters.
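The compression idea can be illustrated with a minimal sketch, which is a toy under stated assumptions rather than the paper's algorithm. It assumes that each training case is a short tuple of binary symbols, that a "pattern" of order k fixes the last k symbols of a case, and that every original parameter has an independent symmetric stable prior with index `alpha` and a given scale. The names `group_patterns` and `compressed_scale` are hypothetical. Patterns matched by exactly the same set of training cases are grouped together, and the scale of the prior for the sum of a group follows from the closure of symmetric stable distributions under addition.

```python
from collections import defaultdict
from itertools import product


def group_patterns(train_x, max_order, alphabet=(0, 1)):
    """Group suffix patterns by the set of training cases they match.

    A pattern of order k (hypothetical toy definition) is a tuple of k
    symbols that is compared with the last k symbols of each case.
    Patterns with identical match sets take the same value on every
    training case, so only the sum of their parameters matters.
    """
    groups = defaultdict(list)
    for order in range(1, max_order + 1):
        for pattern in product(alphabet, repeat=order):
            matches = frozenset(
                i for i, x in enumerate(train_x)
                if tuple(x[-order:]) == pattern
            )
            groups[matches].append(pattern)
    return dict(groups)


def compressed_scale(scales, alpha):
    """Scale of a sum of independent symmetric stable variables.

    If each term has index alpha and scale w_i, the sum is symmetric
    stable with the same index and scale (sum of w_i**alpha)**(1/alpha).
    alpha=1 is the Cauchy case, alpha=2 the Gaussian case.
    """
    return sum(w ** alpha for w in scales) ** (1.0 / alpha)


# Three toy training cases of length 3 (made-up data for illustration).
train_x = [(0, 1, 1), (1, 1, 1), (0, 0, 1)]
groups = group_patterns(train_x, max_order=3)

# Patterns that match no training case do not affect the likelihood,
# so only groups with a non-empty match set need a compressed parameter.
active = {cases: pats for cases, pats in groups.items() if cases}
n_original = sum(len(pats) for pats in active.values())
n_compressed = len(active)
```

In this toy data set the patterns `(0, 1)` and `(0, 0, 1)` are matched only by the third case, so a single compressed parameter can stand for their sum; with unit-scale Cauchy priors on both, `compressed_scale([1.0, 1.0], alpha=1.0)` gives the scale of that compressed parameter's prior.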