The Sparse Vector Technique, Revisited

We revisit one of the most basic and widely applicable techniques in the differential privacy literature: the sparse vector technique [Dwork et al., STOC 2009]. Loosely speaking, this technique allows us to privately test whether the value of a given query is close to what we expect it to be (w.r.t. the input database), where we are allowed to test an unbounded number of queries as long as their values are indeed close to what we expected. The process halts after the first time this is not the case. We present a modification of the sparse vector technique that allows for a more fine-tuned privacy analysis. As a result, in some cases we are able to continue testing queries even after the first time a query fails to meet our expectations. We demonstrate our technique by applying it to the shifting-heavy-hitters problem: on every time step, each of n users gets a new input, and the task is to privately identify all the current heavy-hitters. That is, on time step i, the goal is to identify all data elements x such that many of the users have x as their current input. We present an algorithm for this problem with improved error guarantees over what can be obtained using existing techniques. Specifically, the error of our algorithm depends on the maximal number of times that a single user holds a heavy-hitter as input, rather than the total number of time steps in which a heavy-hitter exists.
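For intuition, the following is a minimal sketch of the classic AboveThreshold instantiation of the sparse vector technique, in its standard textbook form (e.g., as in the Dwork-Roth monograph), not the modified variant proposed in this paper. The function name and the Laplace noise scales (2/ε for the threshold, 4/ε per query) are the usual ones for sensitivity-1 queries.

```python
import numpy as np

def above_threshold(queries, database, threshold, epsilon, rng=None):
    """Classic AboveThreshold (sparse vector technique), sketched.

    Answers 'below' for every sensitivity-1 query whose noisy value
    stays under a noisy threshold, and halts right after the first
    query that appears to exceed it. Satisfies epsilon-DP.
    """
    rng = rng or np.random.default_rng()
    # The threshold is noised once, with Laplace scale 2/epsilon.
    noisy_threshold = threshold + rng.laplace(scale=2.0 / epsilon)
    answers = []
    for q in queries:
        # Each query gets fresh Laplace noise of scale 4/epsilon.
        noisy_value = q(database) + rng.laplace(scale=4.0 / epsilon)
        if noisy_value >= noisy_threshold:
            answers.append("above")
            return answers  # the classic variant must stop here
        answers.append("below")
    return answers
```

The modification described in the abstract concerns precisely the early-return line: under the paper's refined analysis, the process need not always terminate at the first "above" answer.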

[1] Thomas Steinke et al. Between Pure and Approximate Differential Privacy. J. Priv. Confidentiality, 2015.

[2] Avrim Blum et al. Privacy-Preserving Public Information for Sequential Games. ITCS, 2014.

[3] Moni Naor et al. On the complexity of differentially private data release: efficient algorithms and hardness results. STOC, 2009.

[4] Raef Bassily et al. Model-Agnostic Private Learning. NeurIPS, 2018.

[5] Guy N. Rothblum et al. Boosting and Differential Privacy. FOCS, 2010.

[6] Jonathan Ullman et al. Private Multiplicative Weights Beyond Linear Queries. PODS, 2014.

[7] Wanrong Zhang et al. Privately detecting changes in unknown distributions. ICML, 2020.

[8] Guy N. Rothblum et al. A Multiplicative Weights Mechanism for Privacy-Preserving Data Analysis. FOCS, 2010.

[9] Aaron Roth et al. Differentially private combinatorial optimization. SODA, 2010.

[10] Ashwin Machanavajjhala et al. On the Privacy Properties of Variants on the Sparse Vector Technique. arXiv, 2015.

[11] Ninghui Li et al. Understanding the Sparse Vector Technique for Differential Privacy. Proc. VLDB Endow., 2016.

[12] Raef Bassily et al. Privately Answering Classification Queries in the Agnostic PAC Model. ALT, 2019.

[13] Vitaly Feldman et al. Individual Privacy Accounting via a Rényi Filter. NeurIPS, 2020.

[14] Haim Kaplan et al. Privately Learning Thresholds: Closing the Exponential Gap. COLT, 2019.

[15] Haim Kaplan et al. How to Find a Point in the Convex Hull Privately. Symposium on Computational Geometry, 2020.

[16] Seth Neel et al. Accuracy First: Selecting a Differential Privacy Level for Accuracy Constrained ERM. NIPS, 2017.

[17] Benjamin Grégoire et al. Proving Differential Privacy via Probabilistic Couplings. LICS, 2016.

[18] Toniann Pitassi et al. Preserving Statistical Validity in Adaptive Data Analysis. STOC, 2014.

[19] Pierre-Yves Strub et al. Advanced Probabilistic Couplings for Differential Privacy. CCS, 2016.

[20] Thomas Steinke et al. Generalization for Adaptively-chosen Estimators via Stable Median. COLT, 2017.

[21] Aaron Roth et al. Privacy and Truthful Equilibrium Selection for Aggregative Games. WINE, 2014.

[22] Haim Kaplan et al. Adversarially Robust Streaming Algorithms via Differential Privacy. NeurIPS, 2020.

[23] Or Sheffet et al. An Optimal Private Stochastic-MAB Algorithm Based on an Optimal Private Stopping Rule. ICML, 2019.

[24] Kobbi Nissim et al. Locating a Small Cluster Privately. PODS, 2016.

[25] Justin Hsu et al. Differential privacy for the analyst via private equilibrium computation. STOC, 2013.

[26] Vitaly Shmatikov et al. Privacy-preserving deep learning. Allerton, 2015.

[27] Cynthia Dwork et al. Calibrating Noise to Sensitivity in Private Data Analysis. TCC, 2006.

[28] Kobbi Nissim et al. Clustering Algorithms for the Centralized and Local Models. ALT, 2017.

[29] Thomas Steinke et al. Make Up Your Mind: The Price of Online Queries in Differential Privacy. SODA, 2016.