Test-time Collective Prediction

An increasingly common setting in machine learning involves multiple parties, each with their own data, who want to jointly make predictions on future test points. Each agent wishes to benefit from the collective expertise of the group to make better predictions than it could alone, yet may be unwilling to release its labeled data or model parameters. In this work, we explore a decentralized mechanism for making collective predictions at test time, inspired by the social-science literature on human consensus-making. Building on a query model that facilitates information exchange among agents, our approach leverages each agent's pre-trained model without relying on external validation data, model retraining, or data pooling. A theoretical analysis shows that in the large-sample limit our approach recovers inverse mean-squared-error (MSE) weighting, which is known to be the optimal way to combine independent, unbiased estimators. Empirically, we demonstrate that our scheme effectively combines models whose quality varies across the input space: the proposed consensus prediction achieves significant gains over classical model averaging, and even outperforms weighted-averaging schemes that have access to additional validation data. Finally, we propose a decentralized jackknife procedure as a tool for evaluating the sensitivity of the collective prediction to any single agent's opinion.
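The two quantitative ideas in the abstract can be illustrated in a few lines of NumPy. The sketch below is illustrative only, with hypothetical per-agent MSEs and centrally pooled estimates; it is not the paper's decentralized protocol. It shows (1) that weighting independent, unbiased estimators inversely to their MSE minimizes the error of the combination, and (2) a jackknife-style leave-one-agent-out check of how sensitive the combined prediction is to any single agent.

```python
# Minimal sketch (assumed setup, not the paper's protocol): combine
# independent, unbiased, Gaussian-noise estimators of a scalar target.
import numpy as np

rng = np.random.default_rng(0)
theta = 2.0                       # unknown target each agent estimates
mses = np.array([0.5, 1.0, 4.0])  # hypothetical per-agent MSE (= variance, since unbiased)

n_trials = 100_000
# Each column holds one agent's unbiased estimates across trials.
estimates = theta + rng.normal(0.0, np.sqrt(mses), size=(n_trials, mses.size))

# (1) Inverse-MSE weights vs. plain averaging.
w = (1.0 / mses) / (1.0 / mses).sum()
uniform = estimates.mean(axis=1)
weighted = estimates @ w
print(f"MSE of uniform average   : {((uniform  - theta) ** 2).mean():.4f}")
print(f"MSE of inverse-MSE combo : {((weighted - theta) ** 2).mean():.4f}")
print(f"theoretical optimum      : {1.0 / (1.0 / mses).sum():.4f}")

# (2) Jackknife-style sensitivity: recombine with agent i left out and
# measure how far the consensus prediction moves on average.
for i in range(mses.size):
    keep = np.delete(np.arange(mses.size), i)
    w_i = (1.0 / mses[keep]) / (1.0 / mses[keep]).sum()
    loo = estimates[:, keep] @ w_i
    print(f"mean |shift| when dropping agent {i}: {np.abs(loo - weighted).mean():.4f}")
```

With these MSEs, the uniform average attains an MSE of about 0.61 while the inverse-MSE combination attains roughly 0.31, matching the theoretical optimum 1 / sum(1/MSE_i); the leave-one-out shifts are largest for the most accurate (most heavily weighted) agent.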
