Embedding Monte Carlo Search of Features in Tree-Based Ensemble Methods

Feature generation is the problem of automatically constructing good features for a given target learning problem. While most feature generation algorithms follow either the filter or the wrapper approach, this paper focuses on embedded feature generation. We propose a general scheme for embedding feature generation in a wide range of tree-based learning algorithms, including single decision trees, random forests, and tree boosting. It is based on formalizing feature construction as a sequential decision-making problem, addressed by a tractable Monte Carlo search algorithm coupled with node splitting. This leads to fast algorithms that are applicable to large-scale problems. We empirically analyze the performance of these tree-based learners, with and without the feature generation capability, on several standard datasets.
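
As a minimal sketch of the idea, assuming a Gini-impurity split criterion and a small {+, -, *} operator grammar over the base features (both are illustrative choices, not necessarily the paper's exact setup), the following Python fragment shows how a Monte Carlo search over constructed features can be embedded in node splitting: each split samples a budget of random feature expressions and keeps the one yielding the largest impurity reduction.

```python
# Illustrative sketch (not the authors' implementation) of embedding a Monte
# Carlo search over constructed features inside decision-tree node splitting.
# Assumed setup: binary classification, numeric inputs, Gini impurity as the
# split criterion, and a {+, -, *} operator grammar over the base features.
import random
import numpy as np

OPS = {"+": np.add, "-": np.subtract, "*": np.multiply}

def random_feature(n_base, max_depth=2, rng=random):
    """Sample a random feature expression tree over the base features."""
    if max_depth == 0 or rng.random() < 0.5:
        return ("var", rng.randrange(n_base))
    op = rng.choice(list(OPS))
    return (op,
            random_feature(n_base, max_depth - 1, rng),
            random_feature(n_base, max_depth - 1, rng))

def evaluate(expr, X):
    """Evaluate a feature expression on the data matrix X (n_samples, n_base)."""
    if expr[0] == "var":
        return X[:, expr[1]]
    return OPS[expr[0]](evaluate(expr[1], X), evaluate(expr[2], X))

def gini(y):
    p = np.bincount(y, minlength=2) / max(len(y), 1)
    return 1.0 - np.sum(p ** 2)

def split_gain(values, y):
    """Best Gini-impurity reduction over thresholds between sorted values."""
    order = np.argsort(values)
    v, ys = values[order], y[order]
    best = 0.0
    for i in range(1, len(v)):
        if v[i] == v[i - 1]:
            continue
        left, right = ys[:i], ys[i:]
        gain = gini(ys) - (len(left) * gini(left)
                           + len(right) * gini(right)) / len(ys)
        best = max(best, gain)
    return best

def monte_carlo_split(X, y, budget=50, rng=random):
    """Monte Carlo search: sample `budget` candidate features, keep the best."""
    best_expr, best_gain = None, -1.0
    for _ in range(budget):
        expr = random_feature(X.shape[1], rng=rng)
        gain = split_gain(evaluate(expr, X), y)
        if gain > best_gain:
            best_expr, best_gain = expr, gain
    return best_expr, best_gain

# Usage: a target that depends on a product of two base features, which a
# search over constructed features can recover while axis-aligned splits cannot.
rng = random.Random(0)
X = np.random.RandomState(0).randn(200, 4)
y = ((X[:, 0] * X[:, 1]) > 0).astype(int)
expr, gain = monte_carlo_split(X, y, budget=200, rng=rng)
print(expr, gain)
```

Applying this search independently at every node, and within every tree of an ensemble, is what makes the scheme compatible with single decision trees, random forests, and tree boosting alike; only the per-node search budget changes the cost.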
