Ranges and Trackers in Statistical Databases

The goal of statistical databases is to provide statistics about groups of individuals while protecting their privacy. Sometimes, by correlating enough statistics, sensitive data about an individual can be inferred. The problem of protecting against such indirect disclosures of confidential data is called the inference problem, and a protecting mechanism is called an inference control. A good inference control mechanism should be effective (it should provide security to a reasonable extent) and feasible (a practical way to enforce it should exist). At the same time, it should retain the richness of the information revealed to users. Over the last few years, several techniques have been developed for controlling inferences. One of the earliest inference controls for statistical databases refuses to answer queries whose query-sets are too small or too large. However, this technique is easily subverted. In a previous paper [5] we proposed a new query-set size inference control that is based on the idea of multiranges and performs better than the original one. In this paper we go further and investigate the consequences of a non-uniform distribution of the ranges for which queries are unanswerable.
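
As a rough illustration of the classical query-set size control mentioned above, the following Python sketch answers a count query only when the query-set size lies between k and n - k. It then shows one well-known way such a control can be sidestepped: differencing two answerable queries. The table, the field names, the threshold k = 2 and the function names are hypothetical, chosen only for this example. They are not taken from this paper or from [5], and the sketch does not implement the multirange control.

```python
from typing import Callable, Dict, List, Optional

Record = Dict[str, str]
Predicate = Callable[[Record], bool]

# Hypothetical toy table with n = 8 records.
DB: List[Record] = [
    {"sex": "F", "dept": "CS"},
    {"sex": "F", "dept": "Math"},
    {"sex": "F", "dept": "Math"},
    {"sex": "F", "dept": "Bio"},
    {"sex": "M", "dept": "CS"},
    {"sex": "M", "dept": "CS"},
    {"sex": "M", "dept": "Math"},
    {"sex": "M", "dept": "Bio"},
]


def restricted_count(db: List[Record], pred: Predicate, k: int) -> Optional[int]:
    """Return the query-set size, or None when the query is refused.

    A query is answered only if k <= size <= n - k, where n = len(db).
    """
    n = len(db)
    size = sum(1 for record in db if pred(record))
    if k <= size <= n - k:
        return size
    return None


def is_female(r: Record) -> bool:
    return r["sex"] == "F"


def is_female_in_cs(r: Record) -> bool:
    return r["sex"] == "F" and r["dept"] == "CS"


def is_female_not_in_cs(r: Record) -> bool:
    return r["sex"] == "F" and r["dept"] != "CS"


K = 2  # assumed threshold for this example

# The direct query has a query-set of size 1, so it is refused.
direct = restricted_count(DB, is_female_in_cs, K)

# Both of these query-sets fall inside the answerable range [2, 6].
females = restricted_count(DB, is_female, K)
females_outside_cs = restricted_count(DB, is_female_not_in_cs, K)

# Differencing the two permitted answers recovers the refused count.
inferred = None
if females is not None and females_outside_cs is not None:
    inferred = females - females_outside_cs
```

Here the direct query is refused, yet the difference of the two permitted answers yields the same count, which is the kind of subversion that motivates stronger controls.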