Cluster Ensembles for Network Anomaly Detection

Cluster ensembles aim to find better, more natural clusterings by combining multiple clusterings. We apply ensemble clustering to anomaly detection, hypothesizing that multiple views of the data will improve the detection of attacks. Each clustering rates how anomalous a point is; ratings are combined by averaging or taking either the minimum, the maximum, or median score. The evaluation shows that taking the median prediction from the cluster ensemble results in better performance than single clusterings. Surprisingly, averaging the individual predictions a) leads to worse performance than that of individual clusterings, and b) performs identically to taking the minimum prediction from the ensemble. This counter-intuitive result stems from asymmetric prediction distributions.

[1]  Philip K. Chan,et al.  Learning rules for anomaly detection of hostile network traffic , 2003, Third IEEE International Conference on Data Mining.

[2]  Fabio A. González,et al.  An immunity-based technique to characterize intrusions in computer networks , 2002, IEEE Trans. Evol. Comput..

[3]  Ian Witten,et al.  Data Mining , 2000 .

[4]  Anup K. Ghosh,et al.  A Study in Using Neural Networks for Anomaly and Misuse Detection , 1999, USENIX Security Symposium.

[5]  R. Sekar,et al.  A fast automaton-based method for detecting anomalous program behaviors , 2001, Proceedings 2001 IEEE Symposium on Security and Privacy. S&P 2001.

[6]  Andrew W. Moore,et al.  Accelerating exact k-means algorithms with geometric reasoning , 1999, KDD '99.

[7]  Eleazar Eskin,et al.  A GEOMETRIC FRAMEWORK FOR UNSUPERVISED ANOMALY DETECTION: DETECTING INTRUSIONS IN UNLABELED DATA , 2002 .

[8]  Joydeep Ghosh,et al.  Cluster Ensembles --- A Knowledge Reuse Framework for Combining Multiple Partitions , 2002, J. Mach. Learn. Res..

[9]  Matthew V. Mahoney,et al.  Network traffic anomaly detection based on packet bytes , 2003, SAC '03.

[10]  Philip K. Chan,et al.  PHAD: packet header anomaly detection for identifying hostile network traffic , 2001 .

[11]  F. Leisch Bagged Clustering , 1999 .

[12]  Sushil Jajodia,et al.  Detecting Novel Network Intrusions Using Bayes Estimators , 2001, SDM.

[13]  Philip K. Chan,et al.  A Machine Learning Approach to Anomaly Detection , 2003 .

[14]  R. Sekar,et al.  Specification-based anomaly detection: a new approach for detecting network intrusions , 2002, CCS '02.

[15]  Ana L. N. Fred,et al.  Analysis of consensus partition in cluster ensemble , 2004, Fourth IEEE International Conference on Data Mining (ICDM'04).

[16]  Mohammed J. Zaki,et al.  ADMIT: anomaly-based data mining for intrusions , 2002, KDD.

[17]  Sandrine Dudoit,et al.  Bagging to Improve the Accuracy of A Clustering Procedure , 2003, Bioinform..

[18]  Leo Breiman,et al.  Bagging Predictors , 1996, Machine Learning.

[19]  Ana L. N. Fred,et al.  Data clustering using evidence accumulation , 2002, Object recognition supported by user interaction for service robots.

[20]  Ludmila I. Kuncheva,et al.  Moderate diversity for better cluster ensembles , 2006, Inf. Fusion.

[21]  Salvatore J. Stolfo,et al.  Real time data mining-based intrusion detection , 2001, Proceedings DARPA Information Survivability Conference and Exposition II. DISCEX'01.

[22]  Richard Lippmann,et al.  The 1999 DARPA off-line intrusion detection evaluation , 2000, Comput. Networks.

[23]  Mohamed S. Kamel,et al.  Finding Natural Clusters Using Multi-clusterer Combiner Based on Shared Nearest Neighbors , 2003, Multiple Classifier Systems.