Approximate Query Processing Using Multilayered Data Model to Handle Environmental Constraints, Privacy and Avoiding Inferences

In this paper, we describe a query approximation system built on the Multi-Layered Database (MLDB), a collection of summarized relational data generated using domain-based concept hierarchies. The system generates approximate answers to queries in order to handle environmental constraints and access control levels, thus preserving the privacy and security of the data. Using concept hierarchies (CHs), we generalize attributes to transform base relations into layers of summarized relations, each corresponding to an access control level. The resulting summary databases compress the tuples of the main database according to the CHs constructed from the domain set. A query is rewritten by traversing the MLDB layers according to the user's access control level. We present summarization methods, query rewriting algorithms, and the implementation and experimental results of the system. In addition, we analyze some of the known inferences in Multi-Level Secure (MLS) databases and explore their effects on an approximate query processor that uses the MLDB model. By analyzing inferential queries, we identify the relationships common among them and use these relationships in possible solutions that detect and prevent inference problems. These solutions are added as patches to the MLDB query processor, forming a system that provides approximate results while preserving privacy and blocking the possible inferences. We observe that the patches introduce only very small overheads in MLDB generation and query processing.
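
As a minimal illustration of the generalization and rewriting steps described above (a sketch under stated assumptions, not the system's actual implementation), the following Python fragment builds summary layers from a toy relation and answers a query at a chosen layer. The hierarchy contents, the relation, the function names, and the mapping of one layer per generalization step are all hypothetical.

```python
from collections import Counter

# Hypothetical concept hierarchy for a "city" attribute: child -> parent.
CITY_CH = {
    "Chennai": "Tamil Nadu",
    "Madurai": "Tamil Nadu",
    "Mumbai": "Maharashtra",
    "Pune": "Maharashtra",
    "Tamil Nadu": "India",
    "Maharashtra": "India",
}


def generalize(value, hierarchy, levels):
    """Climb `levels` steps up the hierarchy, stopping early at the root."""
    for _ in range(levels):
        if value not in hierarchy:
            break
        value = hierarchy[value]
    return value


def build_layer(rows, attr, hierarchy, levels):
    """Generalize `attr` in every row and merge identical tuples with a count."""
    counts = Counter()
    for row in rows:
        generalized = dict(row)
        generalized[attr] = generalize(row[attr], hierarchy, levels)
        counts[tuple(sorted(generalized.items()))] += 1
    return [dict(key, count=n) for key, n in counts.items()]


def rewrite_and_answer(layers, layer_no, attr, value, hierarchy):
    """Rewrite an equality predicate to the layer's granularity, then filter.

    Assumption: layer k stores `attr` generalized by k steps, and the
    user's access control level has already been mapped to `layer_no`.
    """
    target = generalize(value, hierarchy, layer_no)
    return [r for r in layers[layer_no] if r[attr] == target]


# Toy base relation (layer 0) and two summary layers above it.
rows = [
    {"dept": "Sales", "city": "Chennai"},
    {"dept": "Sales", "city": "Madurai"},
    {"dept": "HR", "city": "Mumbai"},
    {"dept": "Sales", "city": "Pune"},
]
layers = [build_layer(rows, "city", CITY_CH, k) for k in range(3)]

# A user restricted to layer 1 asks about "Chennai"; the predicate is
# rewritten to the state level, so only a summarized answer is returned.
approx = rewrite_and_answer(layers, 1, "city", "Chennai", CITY_CH)
```

In this sketch a less privileged user sees only the state-level tuple and its count, never the individual city rows; a real MLDB would generalize several attributes and would need the inference checks discussed in the abstract.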