Characterizing, modeling, and generating workload spikes for stateful services

Evaluating the resiliency of stateful Internet services to significant workload spikes and data hotspots requires realistic workload traces, which are usually very difficult to obtain. A popular approach is to create a workload model and generate synthetic workload; however, no characterization or model of stateful spikes exists. In this paper, we analyze five workload and data spikes and find that they vary significantly in many important aspects, such as steepness, magnitude, duration, and spatial locality. We propose and validate a model of stateful spikes that allows us to synthesize volume and data spikes and could thus be used by both cloud computing users and providers to stress-test their infrastructure.
