Combining Visualization and Statistical Analysis to Improve Operator Confidence and Efficiency for Failure Detection and Localization

Web applications suffer from software and configuration faults that lower their availability. Recovery time is dominated by the interval between when these faults appear and when site operators detect them. We introduce a set of tools that help operators perceive the presence of failure: an automatic anomaly detector scours HTTP access logs for changes in user behavior that indicate site failures, and a visualizer helps operators rapidly detect and diagnose problems. Visualization addresses a key question in autonomic computing: how to win operators' confidence so that new tools will be embraced. An evaluation using HTTP logs from Ebates.com demonstrates that these tools can improve failure detection as well as shorten detection time. Our approach is application-generic and can be applied to any Web application without the need for instrumentation.
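
To make the idea of finding changes in user behavior from HTTP access logs concrete, the following is a minimal sketch and not necessarily the detector described here. It assumes Common Log Format input and, as an illustrative choice, compares per-page hit counts in a recent window against a reference window with a Pearson chi-square statistic. The function names `page_counts` and `chi_square` are hypothetical.

```python
from collections import Counter


def page_counts(log_lines):
    """Count hits per requested path, assuming Common Log Format lines."""
    counts = Counter()
    for line in log_lines:
        try:
            request = line.split('"')[1]   # e.g. 'GET /index.html HTTP/1.0'
            path = request.split()[1]
        except IndexError:
            continue                       # skip malformed lines
        counts[path] += 1
    return counts


def chi_square(reference, current):
    """Pearson chi-square statistic for a 2 x k table of per-page hit counts.

    A large value suggests the page-hit distribution in `current` differs
    from the one in `reference`. Any alerting threshold is an assumption
    that would need tuning per site.
    """
    n_ref = sum(reference.values())
    n_cur = sum(current.values())
    if n_ref == 0 or n_cur == 0:
        return 0.0                         # nothing to compare
    total = n_ref + n_cur
    stat = 0.0
    for page in set(reference) | set(current):
        ref_hits = reference.get(page, 0)
        cur_hits = current.get(page, 0)
        page_total = ref_hits + cur_hits
        for observed, n in ((ref_hits, n_ref), (cur_hits, n_cur)):
            expected = page_total * n / total
            stat += (observed - expected) ** 2 / expected
    return stat
```

In use, an operator-facing tool might compute `chi_square(page_counts(last_hour), page_counts(last_five_minutes))` on a schedule and plot the score over time, so that the statistical signal and the visualization can be read together.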
