Distributed Robust Learning

We propose a framework for distributed robust statistical learning on {\em big contaminated data}. The Distributed Robust Learning (DRL) framework can reduce the computational time of traditional robust learning methods by several orders of magnitude. We analyze the robustness of DRL, showing that it not only preserves the robustness of the base robust learning method, but also tolerates contamination of a constant fraction of the results returned by the computing nodes (node failures). More precisely, even under the most adversarial distribution of outliers over computing nodes, DRL still achieves a breakdown point of at least $ \lambda^*/2 $, where $ \lambda^* $ is the breakdown point of the corresponding centralized algorithm. This is in stark contrast with the naive division-and-averaging implementation, which may reduce the breakdown point by a factor of $ k $ when $ k $ computing nodes are used. We then specialize the DRL framework to two concrete cases: distributed robust principal component analysis and distributed robust regression. We demonstrate the efficiency and robustness advantages of DRL through comprehensive simulations and through predicting image tags on a large-scale image set.
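The abstract does not spell out the aggregation step, but the described behavior (tolerating corruption of a constant fraction of node results, with a breakdown point of at least half that of the centralized method) is characteristic of a divide-then-robustly-aggregate scheme. The following is a minimal illustrative sketch of that idea, not the paper's exact algorithm: `drl_mean` and `geometric_median` are hypothetical names, the coordinate-wise median stands in for the base robust learner on each node, and the geometric median (computed via Weiszfeld iterations) is one common choice of robust aggregator.

```python
import numpy as np

def geometric_median(points, n_iter=200, tol=1e-8):
    """Geometric median of row vectors via Weiszfeld's iteration.

    The geometric median of k points remains bounded even when
    nearly half of the points are arbitrarily corrupted, which is
    what makes it a natural aggregator for node-level estimates.
    """
    y = points.mean(axis=0)
    for _ in range(n_iter):
        d = np.linalg.norm(points - y, axis=1)
        d = np.maximum(d, 1e-12)  # guard against division by zero
        w = 1.0 / d
        y_new = (w[:, None] * points).sum(axis=0) / w.sum()
        if np.linalg.norm(y_new - y) < tol:
            return y_new
        y = y_new
    return y

def drl_mean(data, k):
    """Illustrative divide-and-aggregate location estimation.

    1. Split the data into k node-local chunks.
    2. Run a robust base estimator on each chunk (here: the
       coordinate-wise median, as a simple stand-in).
    3. Aggregate the k node estimates with the geometric median,
       so that even if some nodes return arbitrarily bad results,
       the final estimate stays close to the truth.
    """
    chunks = np.array_split(data, k)
    node_estimates = np.array([np.median(c, axis=0) for c in chunks])
    return geometric_median(node_estimates)
```

The key design point the sketch illustrates is the contrast drawn in the abstract: averaging the node estimates would let a single corrupted node drag the result arbitrarily far (breakdown point shrinking by a factor of $k$), whereas a robust aggregator such as the geometric median keeps the combined estimate stable as long as a majority of node results are sound.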
