ADDS: Adaptive Differentiable Sampling for Robust Multi-Party Learning

Distributed multi-party learning provides an effective approach to training a joint model on data scattered across participants under legal and practical constraints. However, because data labels are often skewed across participants and local devices are computationally limited, it remains challenging to build smaller, customized models for clients in diverse scenarios while still providing updates that are applicable to the central model. In this paper, we propose ADDS, a novel adaptive differentiable sampling framework for robust and communication-efficient multi-party learning. Inspired by dropout in neural networks, we introduce a network sampling strategy for the multi-party setting that distributes different subnets of the central model to clients for updating, and the differentiable sampling rates allow each client to extract an optimal local architecture from the supernet according to its private data distribution. The approach requires minimal modifications to the existing multi-party learning pipeline and integrates the local updates of all subnets back into the supernet, improving the robustness of the central model. Experiments on real-world datasets show that the proposed framework significantly reduces local computation and communication costs while accelerating convergence of the central model.
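
To make the mechanism concrete, the following is a minimal sketch of what differentiable channel sampling in a multi-party setting could look like. It is not the authors' implementation: the class and function names (SampledLinear, client_update, aggregate), the sigmoid-based soft gating, the L1 sparsity penalty, and the plain averaging on the server are all illustrative assumptions.

```python
# Hypothetical sketch of ADDS-style differentiable sampling; names and details
# are assumptions for illustration, not the paper's actual code.
import copy
import torch
import torch.nn as nn
import torch.nn.functional as F


class SampledLinear(nn.Module):
    """Linear layer whose output channels are gated by learnable,
    differentiable sampling rates (sigmoid of per-channel logits)."""

    def __init__(self, in_features, out_features):
        super().__init__()
        self.linear = nn.Linear(in_features, out_features)
        # One logit per output channel; sigmoid(logit) acts as a soft keep-probability.
        self.sampling_logits = nn.Parameter(torch.zeros(out_features))

    def forward(self, x):
        keep_prob = torch.sigmoid(self.sampling_logits)  # differentiable sampling rates
        return self.linear(x) * keep_prob                # soft channel gating


def client_update(global_state, data_loader, lr=1e-2, epochs=1, sparsity_weight=1e-3):
    """Each client copies the supernet, then jointly trains the weights and its own
    sampling rates on private data; an L1 term on the keep-probabilities pushes the
    client toward a smaller subnet suited to its data distribution."""
    model = nn.Sequential(SampledLinear(784, 128), nn.ReLU(), SampledLinear(128, 10))
    model.load_state_dict(global_state, strict=False)
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    for _ in range(epochs):
        for x, y in data_loader:
            loss = F.cross_entropy(model(x.view(x.size(0), -1)), y)
            for m in model.modules():
                if isinstance(m, SampledLinear):
                    loss = loss + sparsity_weight * torch.sigmoid(m.sampling_logits).sum()
            opt.zero_grad()
            loss.backward()
            opt.step()
    return model.state_dict()


def aggregate(global_state, client_states):
    """Server-side fusion: fold the clients' (masked) updates back into the supernet.
    Plain averaging is used here only to keep the sketch short."""
    new_state = copy.deepcopy(global_state)
    for key in new_state:
        stacked = torch.stack([cs[key].float() for cs in client_states])
        new_state[key] = stacked.mean(dim=0)
    return new_state
```

In a faithful implementation the server would presumably fuse only the channels each subnet actually covered and keep the sampling rates client-specific rather than averaging them; the uniform averaging above is a simplification of that integration step.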
