Finite Time Analysis of Stratified Sampling for Monte Carlo

We consider the problem of stratified sampling for Monte Carlo integration. We model this problem in a multi-armed bandit setting, where the arms represent the strata, and the goal is to estimate a weighted average of the mean values of the arms. We propose a strategy that samples the arms according to an upper bound on their standard deviations and compare its estimation quality to an ideal allocation that would know the standard deviations of the strata. We provide two regret analyses: a distribution-dependent bound $O(n^{-3/2})$ that depends on a measure of the disparity of the strata, and a distribution-free bound $O(n^{-4/3})$ that does not.
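
The idea of allocating samples by an upper bound on each stratum's standard deviation can be illustrated with a minimal Python sketch. It is not the exact algorithm analyzed in the paper. The function names, the exploration constant `c`, the form of the confidence term `c / sqrt(t)`, and the Gaussian strata in the usage note are all assumptions made for illustration. `neyman_allocation` computes the ideal split when the standard deviations are known (budget proportional to weight times standard deviation), and `adaptive_stratified_estimate` approximates that split from data.

```python
import math
import random


def neyman_allocation(weights, sigmas, n):
    """Oracle budget split: T_k proportional to w_k * sigma_k (sigmas known)."""
    total = sum(w * s for w, s in zip(weights, sigmas))
    return [n * w * s / total for w, s in zip(weights, sigmas)]


def adaptive_stratified_estimate(samplers, weights, n, c=1.0, seed=0):
    """Illustrative adaptive allocation (assumed form, not the paper's exact rule).

    samplers[k](rng) draws one sample from stratum k.
    Returns the weighted-mean estimate and the per-stratum sample counts.
    """
    rng = random.Random(seed)
    K = len(samplers)
    counts = [0] * K
    sums = [0.0] * K
    sumsqs = [0.0] * K

    def pull(k):
        x = samplers[k](rng)
        counts[k] += 1
        sums[k] += x
        sumsqs[k] += x * x

    # Two initial draws per stratum so the sample variance is defined.
    for k in range(K):
        pull(k)
        pull(k)

    for _ in range(n - 2 * K):
        scores = []
        for k in range(K):
            t = counts[k]
            var = max(0.0, (sumsqs[k] - sums[k] ** 2 / t) / (t - 1))
            # Empirical std dev plus a confidence width (hypothetical choice).
            ucb_std = math.sqrt(var) + c / math.sqrt(t)
            scores.append(weights[k] * ucb_std / t)
        # Sample the stratum that is most under-sampled relative to its bound.
        pull(max(range(K), key=scores.__getitem__))

    estimate = sum(w * s / t for w, s, t in zip(weights, sums, counts))
    return estimate, counts
```

As a usage example, with two equally weighted hypothetical strata `lambda rng: rng.gauss(0.0, 0.1)` and `lambda rng: rng.gauss(1.0, 2.0)`, the adaptive rule should spend most of the budget on the noisier stratum, as the oracle split `neyman_allocation([0.5, 0.5], [0.1, 2.0], n)` does.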
