High throughput and programmable online traffic classifier on FPGA

Machine learning (ML) algorithms have been shown to be effective in classifying today's dynamic Internet traffic. Using additional features and more sophisticated ML techniques can improve accuracy and allow a broader range of application classes to be classified. Realizing such classifiers at high data rates, however, is challenging. In this paper, we propose two architectures that realize a complete online traffic classifier using flow-level features. First, we develop a traffic classifier based on the C4.5 decision tree algorithm and the Entropy-MDL discretization algorithm. It achieves an accuracy of 97.92% when classifying a traffic trace consisting of eight application classes. Next, we accelerate our classifier using two architectures on FPGA. One architecture stores the classifier in on-chip distributed RAM and is designed to sustain high throughput. The other architecture stores the classifier in block RAM and is designed to operate with a small hardware footprint, and thus at low hardware cost. Experimental results show that our high-throughput architecture can sustain a throughput of 550 Gbps, assuming a 40-byte packet size. Our low-cost architecture demonstrates 22% better resource efficiency than the high-throughput design. It can be easily replicated to achieve 449 Gbps while supporting 160 input traffic streams concurrently. Both architectures are parameterizable and programmable to support any binary-tree-based traffic classifier. We also develop a tool that allows users to easily map a binary-tree-based classifier to hardware. The tool takes a classifier as input and automatically generates the Verilog code for the corresponding hardware architecture.
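
To illustrate the idea of storing a binary-tree classifier in memory, the following is a minimal software sketch rather than the hardware design described above. It assumes a hypothetical node layout (`Node`, with a feature index, a threshold on a discretized feature value, two child addresses, and a leaf label) and a toy three-class tree; none of these names or values come from the paper. The helper `packets_per_second` only converts a line rate and packet size into a packet rate, which is how a figure such as 550 Gbps at 40-byte packets can be read.

```python
from typing import List, NamedTuple


class Node(NamedTuple):
    """One entry of a flat node table (hypothetical layout, one entry per address)."""
    feature: int    # index of the discretized flow feature to test (ignored at leaves)
    threshold: int  # take the left child if the feature value <= threshold
    left: int       # table address of the left child
    right: int      # table address of the right child
    label: int      # application class at a leaf, -1 for an internal node


def classify(table: List[Node], features: List[int]) -> int:
    """Walk the tree from address 0 until a leaf is reached and return its label."""
    addr = 0
    while table[addr].label < 0:
        node = table[addr]
        if features[node.feature] <= node.threshold:
            addr = node.left
        else:
            addr = node.right
    return table[addr].label


def packets_per_second(throughput_gbps: float, packet_bytes: int) -> float:
    """Convert a line rate in Gbps to a packet rate for a fixed packet size."""
    return throughput_gbps * 1e9 / (packet_bytes * 8)


# Toy tree with three classes (illustrative values only).
TOY_TREE = [
    Node(feature=0, threshold=2, left=1, right=2, label=-1),    # root
    Node(feature=-1, threshold=0, left=-1, right=-1, label=0),  # leaf: class 0
    Node(feature=1, threshold=5, left=3, right=4, label=-1),    # internal node
    Node(feature=-1, threshold=0, left=-1, right=-1, label=1),  # leaf: class 1
    Node(feature=-1, threshold=0, left=-1, right=-1, label=2),  # leaf: class 2
]

if __name__ == "__main__":
    print(classify(TOY_TREE, [3, 4]))
    print(packets_per_second(550, 40))
```

Under these assumptions, 550 Gbps at 40-byte packets corresponds to roughly 1.72 billion packets per second. Each tree level in the sketch costs one table lookup, which is the kind of access a RAM-based design can pipeline.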
