Head-body partitioned string matching for Deep Packet Inspection with scalable and attack-resilient performance

Dictionary-based string matching (DBSM) is a critical component of Deep Packet Inspection (DPI), where thousands of malicious patterns are matched against high-bandwidth network traffic. Deterministic finite automata constructed with the Aho-Corasick algorithm (AC-DFA) have been widely used to solve this problem. However, the state transition table (STT) of a large-scale DBSM AC-DFA can span hundreds of megabytes of system memory, whose limited bandwidth and long latency can become the performance bottleneck. We propose a novel partitioning algorithm that converts an AC-DFA into a “head” part and a “body” part. The head part behaves as a traditional AC-DFA that matches the pattern prefixes up to a predefined length; the body part extends any head match to the full pattern length through parallel body-tree traversals. Taking advantage of the SIMD instructions in modern x86-64 multi-core processors, we design compact and efficient data structures that pack multi-path and multi-stride pattern segments in the body-tree. Compared with an optimized AC-DFA solution, our head-body matching (HBM) implementation achieves 1.2x to 3x the throughput as the input match (attack) ratio varies from 2% to 32%. Our HBM data structure is over 20x smaller than a fully-populated AC-DFA for both the Snort and ClamAV dictionaries. The aggregate throughput of our HBM approach scales almost 7x with 8 threads, to over 10 Gbps, on a dual-socket quad-core Opteron (Shanghai) server.
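The head-body split described above can be illustrated with a minimal Python sketch. It is not the implementation from the paper: the head is a plain Aho-Corasick automaton built over pattern prefixes truncated to `head_len`, and the body step is replaced by a direct suffix comparison instead of the SIMD-packed body-tree traversal. The names `build_head_automaton`, `hbm_search`, and `head_len` are hypothetical and chosen only for this illustration.

```python
from collections import deque


def build_head_automaton(patterns, head_len):
    """Build an Aho-Corasick automaton over prefixes truncated to head_len."""
    goto = [{}]   # goto[state][char] -> next state
    fail = [0]    # failure link of each state
    out = [[]]    # pattern indices whose head ends at this state

    # Insert the head (first head_len characters) of every pattern into a trie.
    for idx, pat in enumerate(patterns):
        state = 0
        for ch in pat[:head_len]:
            if ch not in goto[state]:
                goto.append({})
                fail.append(0)
                out.append([])
                goto[state][ch] = len(goto) - 1
            state = goto[state][ch]
        out[state].append(idx)

    # Breadth-first pass to compute failure links and merge output lists.
    queue = deque(goto[0].values())
    while queue:
        s = queue.popleft()
        for ch, t in goto[s].items():
            f = fail[s]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[t] = goto[f].get(ch, 0)
            out[t] = out[t] + out[fail[t]]
            queue.append(t)
    return goto, fail, out


def hbm_search(text, patterns, head_len):
    """Return (start, pattern) pairs; head via automaton, body via comparison."""
    goto, fail, out = build_head_automaton(patterns, head_len)
    matches = []
    state = 0
    for i, ch in enumerate(text):
        # Head stage: ordinary Aho-Corasick step on the truncated prefixes.
        while state and ch not in goto[state]:
            state = fail[state]
        state = goto[state].get(ch, 0)
        # Body stage (simplified assumption): verify the rest of each
        # candidate pattern by comparing it with the text after the head.
        for idx in out[state]:
            pat = patterns[idx]
            head = min(len(pat), head_len)
            if text.startswith(pat[head:], i + 1):
                matches.append((i - head + 1, pat))
    return matches
```

Calling `hbm_search(text, patterns, head_len)` returns every occurrence as a `(start_offset, pattern)` pair. In this sketch a smaller `head_len` shrinks the automaton but sends more candidates to the body check, which mirrors the trade-off the head-body partitioning is meant to balance.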
