Local Privacy and Minimax Bounds: Sharp Rates for Probability Estimation

We provide a detailed study of the estimation of probability distributions, both discrete and continuous, in a stringent setting in which data is kept private even from the statistician. We give sharp minimax rates of convergence for estimation in these locally private settings, exhibiting fundamental tradeoffs between privacy and convergence rate, and we provide tools for moving along the privacy-statistical efficiency continuum. One consequence of our results is that Warner's classical randomized response is an optimal way to perform survey sampling while maintaining the privacy of the respondents.
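To make the randomized response idea concrete, the following is a minimal sketch of the standard one-bit mechanism, not the paper's own construction. It assumes a privacy parameter `epsilon` in the usual local differential privacy sense: each respondent reports the true bit with probability e^epsilon / (1 + e^epsilon) and the flipped bit otherwise. The function names, the sample size, and the true proportion 0.3 are illustrative assumptions.

```python
import math
import random


def randomized_response(bit, epsilon, rng):
    """Report the true bit with probability e^eps / (1 + e^eps), else flip it.

    The ratio of the two reporting probabilities is e^eps, which is the
    usual epsilon-local-privacy guarantee for a single binary answer.
    """
    p_truth = math.exp(epsilon) / (1.0 + math.exp(epsilon))
    return bit if rng.random() < p_truth else 1 - bit


def estimate_proportion(reports, epsilon):
    """Debias the mean of the privatized reports.

    If theta is the true proportion of ones and p is the truth probability,
    then E[report] = (2p - 1) * theta + (1 - p); this inverts that map.
    """
    p_truth = math.exp(epsilon) / (1.0 + math.exp(epsilon))
    mean_report = sum(reports) / len(reports)
    return (mean_report - (1.0 - p_truth)) / (2.0 * p_truth - 1.0)


# Illustrative simulation (all values below are assumptions for the demo).
rng = random.Random(0)
epsilon = 1.0
true_theta = 0.3
true_bits = [1 if rng.random() < true_theta else 0 for _ in range(100_000)]
reports = [randomized_response(b, epsilon, rng) for b in true_bits]
theta_hat = estimate_proportion(reports, epsilon)
print(f"true proportion: {true_theta}, private estimate: {theta_hat:.4f}")
```

The statistician sees only `reports`, never `true_bits`. The debiasing step divides by 2p - 1, which shrinks toward zero as `epsilon` decreases, so the variance of the estimate grows as privacy gets stronger. This is a small instance of the privacy versus convergence rate tradeoff described above.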
