Paper Information - Robustness Guarantees for Mode Estimation with an Application to Bandits

Mode estimation is a classical problem in statistics with a wide range of applications in machine learning. Despite this, there is little understanding of its robustness properties under possibly adversarial data contamination. In this paper, we give precise robustness guarantees as well as privacy guarantees under simple randomization. We then introduce a theory for multi-armed bandits where the values are the modes of the reward distributions instead of the means. We prove regret guarantees for the problems of top arm identification, top m-arms identification, contextual modal bandits, and top arm recovery over infinite continuous arms. We show in simulations that our algorithms are robust to perturbation of the arms by adversarial noise sequences, thus rendering modal bandits an attractive choice in situations where the rewards may have outliers or adversarial corruptions.

Michael I. Jordan | Aldo Pacchiano | Heinrich Jiang

[1] Jun Sakuma,et al. Differentially Private Analysis of Outliers , 2015, ECML/PKDD.

[2] L. Wasserman,et al. Nonparametric modal regression , 2014, 1412.1716.

[3] Samory Kpotufe,et al. Modal-set estimation with an application to clustering , 2016, AISTATS.

[4] Vaithianathan Venkatasubramanian,et al. Electromechanical mode estimation using recursive adaptive stochastic subspace identification , 2014, 2014 IEEE PES T&D Conference and Exposition.

[5] Sanjoy Dasgupta,et al. Rates of convergence for the cluster tree , 2010, NIPS.

[6] Shie Mannor,et al. PAC Bounds for Multi-armed Bandit and Markov Decision Processes , 2002, COLT.

[7] Yizong Cheng,et al. Mean Shift, Mode Seeking, and Clustering , 1995, IEEE Trans. Pattern Anal. Mach. Intell..

[8] Robert D. Nowak,et al. Best-arm identification algorithms for multi-armed bandits in the fixed confidence setting , 2014, 2014 48th Annual Conference on Information Sciences and Systems (CISS).

[9] Brian C. Williams,et al. Mode Estimation of Probabilistic Hybrid Systems , 2002, HSCC.

[10] Philippe Vieu,et al. A note on density mode estimation , 1996 .

[11] Sanjoy Dasgupta,et al. Optimal rates for k-NN density and mode estimation , 2014, NIPS.

[12] Stefano Soatto,et al. Quick Shift and Kernel Methods for Mode Seeking , 2008, ECCV.

[13] Brian C. Williams,et al. Mode Estimation of Model-based Programs: Monitoring Systems with Complex Behavior , 2001, IJCAI.

[14] Eyke Hüllermeier,et al. Qualitative Multi-Armed Bandits: A Quantile-Based Approach , 2015, ICML.

[15] B. Silverman,et al. Using Kernel Density Estimates to Investigate Multimodality , 1981 .

[16] Abdelaziz Benallegue,et al. Backstepping control with exact 2-sliding mode estimation for a quadrotor unmanned aerial vehicle , 2007, 2007 IEEE/RSJ International Conference on Intelligent Robots and Systems.

[17] Moni Naor,et al. Our Data, Ourselves: Privacy Via Distributed Noise Generation , 2006, EUROCRYPT.

[18] E. Parzen. On Estimation of a Probability Density Function and Mode , 1962 .

[19] Prachi Shah,et al. Comparison of mode estimation methods and application in molecular clock analysis , 2003, BMC Bioinformatics.

[20] Robert T. Collins,et al. Mean-shift blob tracking through scale space , 2003, 2003 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2003. Proceedings..

[21] H. Chernoff. Estimation of the mode , 1964 .

[22] Jian Li,et al. Practical Algorithms for Best-K Identification in Multi-Armed Bandits , 2017, ArXiv.

[23] Hai Jin,et al. Color Image Segmentation Based on Mean Shift and Normalized Cuts , 2007, IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics).

[24] H. Robbins. Some aspects of the sequential design of experiments , 1952 .

[25] Takeo Kanade,et al. Mode-seeking by Medoidshifts , 2007, 2007 IEEE 11th International Conference on Computer Vision.

[26] Aaron Roth,et al. The Algorithmic Foundations of Differential Privacy , 2014, Found. Trends Theor. Comput. Sci..

[27] Cynthia Dwork,et al. Differential privacy and robust statistics , 2009, STOC '09.

[28] Hajime Yamato,et al. SEQUENTIAL ESTIMATION OF A CONTINUOUS PROBABILITY DENSITY FUNCTION AND MODE , 1971 .

[29] Dominik D. Freydenberger,et al. Can We Learn to Gamble Efficiently? , 2010, COLT.

[30] Jill M. Boyce,et al. Fast mode decision and motion estimation for JVT/H.264 , 2003, Proceedings 2003 International Conference on Image Processing (Cat. No.03CH37429).

[31] Alessandro Lazaric,et al. Best Arm Identification: A Unified Approach to Fixed Budget and Fixed Confidence , 2012, NIPS.

[32] Sébastien Bubeck,et al. Regret Analysis of Stochastic and Nonstochastic Multi-armed Bandit Problems , 2012, Found. Trends Mach. Learn..