Random Forests for Regression as a Weighted Sum of $k$-Potential Nearest Neighbors

In this paper, we study random forests for regression expressed as weighted sums of datapoints. We analyze the theoretical behavior of $k$-potential nearest neighbors ($k$-PNNs) under bagging and derive an upper bound on the weight of any datapoint in a random forest with an arbitrary splitting criterion, provided the trees are unpruned and stop growing only when they have $k$ or fewer datapoints at their leaves. Using this bound, together with the concept of b-terms (i.e., bootstrap terms) introduced in this paper, we derive the explicit expression of datapoint weights under random $k$-PNN selection, a datapoint selection strategy that we also introduce, and we build a framework for deriving other bagged estimators by a similar procedure. Finally, within this framework we derive the explicit weights of a regression estimate that is equivalent to a random forest regression estimate with the random splitting criterion, and we demonstrate the equivalence both theoretically and empirically.
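The view of a random forest regression estimate as a weighted sum of datapoints can be checked numerically. The sketch below is an illustration, not the paper's estimator: it fits a scikit-learn `RandomForestRegressor` without bootstrap (so each tree's leaf value is the exact mean of the training targets in that leaf), recovers each training point's weight from leaf co-membership via `apply`, and verifies that the weighted sum of targets reproduces the forest's prediction. The stopping rule "a node becomes a leaf once it holds $k$ or fewer datapoints" is approximated here with `min_samples_split=k+1`; all parameter choices are illustrative assumptions.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
X = rng.uniform(size=(200, 3))
y = X[:, 0] + 0.1 * rng.normal(size=200)

k = 5  # leaves hold at most k points (approximated via min_samples_split)
n_trees = 50
rf = RandomForestRegressor(
    n_estimators=n_trees,
    bootstrap=False,          # no resampling: leaf values are exact leaf means
    min_samples_split=k + 1,  # nodes with <= k points are not split further
    random_state=0,
)
rf.fit(X, y)

x_new = rng.uniform(size=(1, 3))
train_leaves = rf.apply(X)      # (n_samples, n_trees) leaf index per tree
query_leaves = rf.apply(x_new)  # (1, n_trees)

# Weight of training point i: average over trees of 1/|leaf| when
# x_i falls in the same leaf as the query point, 0 otherwise.
w = np.zeros(len(X))
for b in range(n_trees):
    same_leaf = train_leaves[:, b] == query_leaves[0, b]
    w += same_leaf / same_leaf.sum()
w /= n_trees

pred_weighted = w @ y
pred_forest = rf.predict(x_new)[0]
```

Since each tree predicts the mean of its leaf and the forest averages the trees, `pred_weighted` matches `pred_forest` up to floating-point error, and the weights sum to one.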
