Capabilities-based query rewriting in mediator systems

Users today struggle to integrate a broad range of information sources that provide different levels of query capabilities. Currently, data sources with different and limited capabilities are accessed either by writing rich functional wrappers for the more primitive sources, or by treating all sources at a "lowest common denominator" level. This paper explores a third approach, in which a mediator ensures that each source receives only queries it can handle, while still taking advantage of all of that source's query power. We propose an architecture that enables this, and identify a key component of that architecture, the Capabilities-Based Rewriter (CBR). The CBR takes as input a description of the capabilities of a data source and a query targeted at that data source. From these, the CBR determines component queries to be sent to the sources, commensurate with their abilities, and computes a plan for combining their results using joins, unions, selections, and projections. We provide a language for describing the query capabilities of data sources and a plan generation algorithm. Both the description language and the plan generation algorithm are schema independent and handle SPJ (select-project-join) queries.
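
To make the idea concrete, the following is a minimal sketch of capabilities-based rewriting for a single source and a selection-projection query. It is not the paper's description language or plan generation algorithm; the names `Condition`, `SourceCapability`, `Plan`, and `rewrite` are hypothetical and introduced only for illustration. The sketch assumes a capability is simply a set of (attribute, operator) pairs the source can evaluate plus a set of attributes it can return. Conditions the source supports are pushed down, and the rest are left for the mediator to apply as a filter over the source's result.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Condition:
    """One selection condition, e.g. author = 'Ullman'."""
    attribute: str
    op: str
    value: object


@dataclass(frozen=True)
class SourceCapability:
    """Hypothetical capability description (an assumption of this sketch)."""
    supported: frozenset   # (attribute, op) pairs the source can evaluate
    exportable: frozenset  # attributes the source can return


@dataclass
class Plan:
    source_conditions: list    # conditions sent to the source
    source_projection: list    # attributes requested from the source
    mediator_conditions: list  # conditions the mediator applies afterwards
    final_projection: list     # attributes returned to the user


def rewrite(conditions, projection, cap) -> Optional[Plan]:
    """Split a query into a source query and a mediator-side filter.

    Returns None when the source cannot export an attribute that the
    user or the mediator-side filter needs.
    """
    pushed = [c for c in conditions if (c.attribute, c.op) in cap.supported]
    residual = [c for c in conditions if (c.attribute, c.op) not in cap.supported]

    # The mediator needs the requested attributes plus every attribute
    # referenced by a condition it must evaluate itself.
    needed = set(projection) | {c.attribute for c in residual}
    if not needed <= cap.exportable:
        return None

    return Plan(pushed, sorted(needed), residual, list(projection))


# Example: the source can only evaluate equality on "author".
cap = SourceCapability(
    supported=frozenset({("author", "=")}),
    exportable=frozenset({"title", "author", "year"}),
)
query = [Condition("author", "=", "Ullman"), Condition("year", ">", 1990)]
plan = rewrite(query, ["title"], cap)
```

In this example the source is asked for `title` and `year` with the `author` condition pushed down, while the `year` comparison stays at the mediator. A lowest-common-denominator approach would instead fetch everything and filter locally, which forgoes the source's own selection capability.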
