Database Theory — ICDT 2001

It is a folk result in database theory that SQL cannot express recursive queries such as reachability; in fact, a new construct was added to SQL3 to overcome this limitation. However, the evidence usually offered for this claim is a reference to a proof that relational algebra cannot express such queries. SQL, on the other hand, has in all its implementations three features that fundamentally distinguish it from relational algebra: grouping, arithmetic operations, and aggregation. In the past few years, most questions about the additional power provided by these features have been answered. This paper surveys those results and presents new, simple, self-contained proofs of the main results on the expressive power of SQL. Somewhat surprisingly, small differences in the language definition affect the results dramatically: under some very natural assumptions, it can be proved that SQL cannot define recursive queries, no matter what aggregate functions and arithmetic operations are allowed. But relaxing these assumptions even slightly makes the problem of proving expressivity bounds for SQL as hard as some long-standing open problems in complexity theory.
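
To make the reachability query concrete, the following is a minimal sketch of how it can be written once a recursive construct is available. It uses the recursive common table expression (`WITH RECURSIVE`) as implemented by SQLite, accessed through Python's standard `sqlite3` module, as a stand-in for the SQL3 construct mentioned above. The table name `edge`, its columns `src` and `dst`, and the function name `reachable_pairs` are illustrative assumptions and do not come from the paper.

```python
import sqlite3


def reachable_pairs(edges):
    """Return all (source, target) pairs joined by a path of length >= 1.

    `edges` is an iterable of (src, dst) integer pairs. The schema used
    here (table `edge` with columns `src`, `dst`) is hypothetical.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE edge (src INTEGER, dst INTEGER)")
    conn.executemany("INSERT INTO edge VALUES (?, ?)", edges)

    # UNION (rather than UNION ALL) discards duplicate rows, so the
    # recursion also terminates on graphs that contain cycles.
    rows = conn.execute(
        """
        WITH RECURSIVE reach(src, dst) AS (
            SELECT src, dst FROM edge
            UNION
            SELECT r.src, e.dst
            FROM reach AS r JOIN edge AS e ON r.dst = e.src
        )
        SELECT src, dst FROM reach
        """
    ).fetchall()
    conn.close()
    return set(rows)
```

The recursive branch is the part that has no counterpart in relational algebra: the query refers to its own result, `reach`, and is evaluated until no new pairs appear.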
