Learning to rank with extremely randomized trees

In this paper, we report on our experiments in the Yahoo! Labs Learning to Rank challenge, organized in the context of the 27th International Conference on Machine Learning (ICML 2010). We competed in both the learning to rank and the transfer learning tracks of the challenge with several tree-based ensemble methods, including Tree Bagging (Breiman, 1996), Random Forests (Breiman, 2001), and Extremely Randomized Trees (Geurts et al., 2006). Our methods ranked 10th in the first track and 4th in the second track. Although our methods did not reach the very top of the ranking, these results show that ensembles of randomized trees are competitive for the learning to rank problem. The paper also analyzes the computing times of our algorithms and presents some post-challenge experiments with transfer learning methods.

[1] Yi Chang, et al. Yahoo! Learning to Rank Challenge Overview, 2010, Yahoo! Learning to Rank Challenge.

[2] Qiang Yang, et al. A Survey on Transfer Learning, 2010, IEEE Transactions on Knowledge and Data Engineering.

[3] Jaana Kekäläinen, et al. Cumulated gain-based evaluation of IR techniques, 2002, TOIS.

[4] Leo Breiman, et al. Bagging Predictors, 1996, Machine Learning.

[5] Jianfeng Gao, et al. Ranking, Boosting, and Model Adaptation, 2008.

[6] Leo Breiman, et al. Classification and Regression Trees, 1984.

[7] Trevor Hastie, et al. The Elements of Statistical Learning, 2001.

[8] C. Burges, et al. Learning to Rank Using Classification and Gradient Boosting, 2008.

[9] Pierre Geurts, et al. Extremely randomized trees, 2006, Machine Learning.

[10] Larry P. Heck, et al. Trada: tree based ranking function adaptation, 2008, CIKM '08.

[11] Yi Su, et al. Model Adaptation via Model Interpolation and Boosting for Web Search Ranking, 2009, EMNLP.

[12] Qiang Wu, et al. McRank: Learning to Rank Using Multiple Classification and Gradient Boosting, 2007, NIPS.

[13] Olivier Chapelle, et al. Expected reciprocal rank for graded relevance, 2009, CIKM.

[14] Wei-Yin Loh, et al. Classification and regression trees, 2011, WIREs Data Mining Knowl. Discov.

[15] Zhaohui Zheng, et al. Stochastic gradient boosted distributed decision trees, 2009, CIKM.

[16] Leo Breiman, et al. Random Forests, 2001, Machine Learning.