Differentially Private Regression and Classification with Sparse Gaussian Processes

A continuing challenge for machine learning is providing methods to perform computation on data while ensuring the data remains private. In this paper we build on the provable privacy guarantees of differential privacy, which has been combined with Gaussian processes (GPs) through the previously published \emph{cloaking method}. We address several shortcomings of this method, starting with the problem of predictions in regions with low data density. We experiment with the use of inducing points to provide a sparse approximation and show that these can provide robust differential privacy in outlier areas and at higher dimensions. We then turn to classification, modifying the Laplace approximation approach to provide differentially private predictions. Combining this with the sparse approximation, we demonstrate the capability to perform classification in high dimensions. Finally, we explore the issue of hyperparameter selection and develop a method for selecting hyperparameters privately. This paper and its associated libraries provide a robust toolkit for combining differential privacy and GPs in a practical manner.
