Empirical Testing of Fast Kernel Density Estimation Algorithms

We present results of experiments testing the Fast Gauss Transform, Improved Fast Gauss Transform, and Dual-Tree methods (using kd-tree and Anchors Hierarchy data structures) for fast Kernel Density Estimation (KDE). We examine the performance of these methods with respect to data set size, dimension, allowable error, and data set structure (“clumpiness”), measured in terms of CPU time and memory usage. This is the first multi-method comparison in the literature. The results are striking, challenging several claims that are commonly made about these methods. The results are useful for researchers considering fast methods for KDE problems. Along the way, we provide a corrected error bound and a parameter-selection regime for the IFGT algorithm.