Communication-efficient distributed statistical learning

We present the Communication-efficient Surrogate Likelihood (CSL) framework for solving distributed statistical learning problems. CSL provides a communication-efficient surrogate for the global likelihood that can be used for low-dimensional estimation, high-dimensional regularized estimation, and Bayesian inference. For low-dimensional estimation, CSL provably improves upon naive averaging schemes and facilitates the construction of confidence intervals. For high-dimensional regularized estimation, CSL leads to a minimax-optimal estimator with minimal communication cost. For Bayesian inference, CSL can be used to form a communication-efficient quasi-posterior distribution that converges to the true posterior. This quasi-posterior procedure significantly improves the computational efficiency of MCMC algorithms even in a non-distributed setting. We illustrate the methods through empirical studies.
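To make the construction concrete, here is a minimal sketch of the CSL surrogate for distributed logistic regression. With data split across k machines, local empirical loss L_1 on the first machine, global loss L_N, and a current iterate θ̄, the surrogate is L̃(θ) = L_1(θ) − ⟨∇L_1(θ̄) − ∇L_N(θ̄), θ⟩; evaluating it requires only machine 1's data plus one communicated d-dimensional gradient per machine per round. Everything below (the function names, the gradient-descent inner solver, the zero initialization) is an illustrative assumption, not the paper's implementation.

import numpy as np

def logistic_loss_grad(theta, X, y):
    # Average negative log-likelihood for labels y in {0, 1} and its gradient.
    p = 1.0 / (1.0 + np.exp(-(X @ theta)))
    loss = -np.mean(y * np.log(p + 1e-12) + (1 - y) * np.log(1 - p + 1e-12))
    grad = X.T @ (p - y) / len(y)
    return loss, grad

def csl_surrogate_grad(theta, theta_bar, X1, y1, global_grad):
    # Gradient of the surrogate L1(theta) - <grad L1(theta_bar) - grad L_N(theta_bar), theta>.
    # Needs only machine 1's data and the single communicated vector global_grad.
    _, grad1 = logistic_loss_grad(theta, X1, y1)
    _, grad1_bar = logistic_loss_grad(theta_bar, X1, y1)
    return grad1 - (grad1_bar - global_grad)

def csl_round(theta_bar, X_parts, y_parts, lr=0.5, n_steps=500):
    # One CSL round: every machine sends its local gradient at theta_bar
    # (the only communication); machine 1 then minimizes the surrogate locally.
    grads = [logistic_loss_grad(theta_bar, X, y)[1] for X, y in zip(X_parts, y_parts)]
    global_grad = np.mean(grads, axis=0)
    theta = theta_bar.copy()
    for _ in range(n_steps):
        theta = theta - lr * csl_surrogate_grad(theta, theta_bar,
                                                X_parts[0], y_parts[0], global_grad)
    return theta

if __name__ == "__main__":
    # Toy run: 4 machines, 500 samples each, 3 parameters.
    rng = np.random.default_rng(0)
    theta_true = np.array([1.0, -2.0, 0.5])
    X_parts, y_parts = [], []
    for _ in range(4):
        X = rng.normal(size=(500, 3))
        y = (rng.random(500) < 1.0 / (1.0 + np.exp(-(X @ theta_true)))).astype(float)
        X_parts.append(X)
        y_parts.append(y)
    theta = np.zeros(3)  # crude initializer; the paper starts from a local estimator
    for _ in range(3):   # a few CSL rounds, one gradient exchange each
        theta = csl_round(theta, X_parts, y_parts)
    print(theta)

Note that at θ = θ̄ the surrogate gradient equals the global gradient ∇L_N(θ̄), so iterating the round drives θ toward the global minimizer while each round communicates only O(d) numbers per machine rather than the raw data.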
