GPU-Accelerated Parameter Optimization for Classification Rule Learning

While some studies comparing rule-based classifiers enumerate a single parameter over several values, most use all default values, presumably due to the high computational cost of jointly tuning multiple parameters. We show that thorough, joint optimization of search parameters on individual datasets gives higher out-of-sample precision than fixed baselines. We test on 1,000 relatively large synthetic datasets with widely varying properties. We optimize heuristic beam search with the m-estimate interestingness measure, jointly tuning m, the beam size, and the maximum rule length. The beam size controls the extent of the search, where oversearching can find spurious rules. m controls the bias toward higher-frequency rules, with the optimal value depending on the amount of noise in the dataset. We assert that such hyper-parameters affecting the frequency bias and the extent of search should be optimized simultaneously, since both directly affect the false-discovery rate. While our method, based on grid search and cross-validation, is computationally intensive, we show that it can be massively parallelized, with our GPU implementation providing up to 28x speedup over a comparable multi-threaded CPU implementation.
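As a minimal sketch of the two ingredients named above, the snippet below computes the standard m-estimate of rule quality, (p + m·prior) / (p + n + m), and enumerates a joint parameter grid over m, beam size, and maximum rule length. The grid values are illustrative assumptions, not the values used in the experiments.

```python
from itertools import product

def m_estimate(p, n, m, prior):
    """m-estimate of rule quality: (p + m * prior) / (p + n + m).

    p: positive examples covered by the rule
    n: negative examples covered by the rule
    m: smoothing parameter (one of the tuned hyper-parameters)
    prior: prior probability of the positive class
    """
    return (p + m * prior) / (p + n + m)

# With m = 0 this reduces to raw precision p / (p + n); larger m
# pulls the estimate toward the class prior, penalizing
# low-coverage (potentially spurious) rules.
print(m_estimate(p=8, n=2, m=0, prior=0.5))  # 0.8
print(m_estimate(p=8, n=2, m=2, prior=0.5))  # 0.75

# Hypothetical joint grid over the three tuned parameters;
# each configuration would be scored by cross-validation.
m_values = [0.01, 0.1, 1.0, 10.0, 100.0]
beam_sizes = [1, 5, 10, 50]
max_rule_lengths = [2, 3, 5]

grid = list(product(m_values, beam_sizes, max_rule_lengths))
print(len(grid))  # 60 candidate configurations
```

Because each grid point is evaluated independently, the configurations can be scored in parallel, which is what makes the GPU parallelization described above effective.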
