Predicting Multiple Metrics for Queries: Better Decisions Enabled by Machine Learning

One of the most challenging aspects of managing a very large data warehouse is identifying how queries will behave before they start executing. Yet knowing their performance characteristics --- their runtimes and resource usage --- can solve two important problems. First, every database vendor struggles with managing unexpectedly long-running queries. When these long-running queries can be identified before they start, they can be rejected or scheduled when they will not cause extreme resource contention for the other queries in the system. Second, deciding whether a system can complete a given workload in a given time period (or a bigger system is necessary) depends on knowing the resource requirements of the queries in that workload. We have developed a system that uses machine learning to accurately predict the performance metrics of database queries whose execution times range from milliseconds to hours. For training and testing our system, we used both real customer queries and queries generated from an extended set of TPC-DS templates. The extensions mimic queries that caused customer problems. We used these queries to compare how accurately different techniques predict metrics such as elapsed time, records used, disk I/Os, and message bytes. The most promising technique was not only the most accurate, but also predicted these metrics simultaneously and using only information available prior to query execution. We validated the accuracy of this machine learning technique on a number of HP Neoview configurations. We were able to predict individual query elapsed time within 20% of its actual time for 85% of the test queries. Most importantly, we were able to correctly identify both the short and long-running (up to two hour) queries to inform workload management and capacity planning.

[1]  H. Hotelling Analysis of a complex of statistical variables into principal components. , 1933 .

[2]  H. Hotelling Relations Between Two Sets of Variates , 1936 .

[3]  J. MacQueen Some methods for classification and analysis of multivariate observations , 1967 .

[4]  Philip S. Yu,et al.  On Workload Characterization of Relational Database Environments , 1992, IEEE Trans. Software Eng..

[5]  D. Patterson,et al.  Performance characterization of a quad Pentium Pro SMP using OLTP workloads , 1998, Proceedings. 25th Annual International Symposium on Computer Architecture (Cat. No.98CB36235).

[6]  S. Parekh,et al.  An analysis of database workload performance on simultaneous multithreaded processors , 1998, Proceedings. 25th Annual International Symposium on Computer Architecture (Cat. No.98CB36235).

[7]  Martin F. Arlitt,et al.  Characterizing Web user sessions , 2000, PERV.

[8]  Volker Markl,et al.  LEO - DB2's LEarning Optimizer , 2001, VLDB.

[9]  Said Elnaffar,et al.  Automatically classifying database workloads , 2002, CIKM '02.

[10]  Volker Markl,et al.  Learning table access cardinalities with LEO , 2002, SIGMOD '02.

[11]  Lieven Eeckhout,et al.  How input data sets change program behaviour , 2002, HPCA 2002.

[12]  Michael I. Jordan,et al.  Kernel independent component analysis , 2003 .

[13]  Surajit Chaudhuri,et al.  Estimating progress of execution for SQL queries , 2004, SIGMOD '04.

[14]  Jeffrey F. Naughton,et al.  Toward a progress indicator for database queries , 2004, SIGMOD '04.

[15]  Nello Cristianini,et al.  Kernel Methods for Pattern Analysis , 2006 .

[16]  J. Chase,et al.  Data Center Workload Monitoring , Analysis , and Emulation , 2005 .

[17]  Surajit Chaudhuri,et al.  When can we trust progress estimators for SQL queries? , 2005, SIGMOD '05.

[18]  Peter J. Haas,et al.  Statistical Learning Techniques for Costing XML Queries , 2005, VLDB.

[19]  Jeffrey F. Naughton,et al.  Increasing the accuracy and coverage of SQL progress indicators , 2005, 21st International Conference on Data Engineering (ICDE'05).

[20]  Hsien-Hsin S. Lee,et al.  Constructing a Non-Linear Model with Neural Networks for Workload Characterization , 2006, 2006 IEEE International Symposium on Workload Characterization.

[21]  Philip S. Yu,et al.  Multi-query SQL Progress Indicators , 2006, EDBT.

[22]  Raghunath Othayoth Nambiar,et al.  The making of TPC-DS , 2006, VLDB.

[23]  Chetan Gupta,et al.  PQR: Predicting Query Execution Times for Autonomous Workload Management , 2008, 2008 International Conference on Autonomic Computing.