High Throughput Hierarchical Heavy Hitter Detection in Data Streams

Detecting heavy activity aggregation in data streams is a critical task for many networking, database and data-mining applications. The aggregation points often belong to hierarchical domains (e.g., the IP domain or an XML data tree) and are referred to as hierarchical heavy hitters. Hierarchical domains are usually very large with respect to both the number of aggregation points and the number of levels in the hierarchy. Because of the huge amount of intermediate data to be maintained and the dependency between hierarchy levels, it is challenging to detect hierarchical heavy hitters in very large hierarchical domains at high throughput. In this work, we propose a novel Sketch-counter-hybrid algorithm that decomposes online hierarchical heavy hitter detection into independent, parallel online heavy hitter detection at each hierarchy level. We then propose a fully pipelined architecture suitable for FPGA implementation to accelerate the algorithm. We use IP network monitoring as an example application. The post place-and-route results on a state-of-the-art FPGA show high throughput and scalability. Our architecture achieves 123 Gbps throughput assuming minimum-sized IPv4 packets while supporting the entire 32-bit IPv4 hierarchy on a single FPGA device. It sustains 100+ Gbps throughput while supporting various hierarchy sizes, stream sizes and accuracy requirements. Our architecture achieves much higher throughput than other techniques when accelerating Sketches with similar numbers of rows and columns and similar counter sizes.
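The per-level decomposition described above can be illustrated with a minimal software sketch. It is not the Sketch-counter-hybrid algorithm or the FPGA pipeline of this work: it assumes a plain Count-Min sketch per hierarchy level, an absolute count threshold, and four hypothetical prefix lengths (/8, /16, /24, /32), and it reports ordinary heavy hitters at each level without discounting traffic already attributed to heavy descendants. The names `CountMinSketch` and `PerLevelDetector` are illustrative only.

```python
import hashlib
from ipaddress import IPv4Address


class CountMinSketch:
    """Plain Count-Min sketch: rows x cols counters, one hash per row."""

    def __init__(self, rows=4, cols=1024):
        self.rows = rows
        self.cols = cols
        self.table = [[0] * cols for _ in range(rows)]

    def _index(self, row, key):
        # Salting BLAKE2b with the row number gives one hash function per row.
        digest = hashlib.blake2b(
            key.to_bytes(8, "big"), digest_size=8, salt=row.to_bytes(8, "big")
        ).digest()
        return int.from_bytes(digest, "big") % self.cols

    def update(self, key, count=1):
        for r in range(self.rows):
            self.table[r][self._index(r, key)] += count

    def estimate(self, key):
        # The minimum over rows never underestimates the true count.
        return min(self.table[r][self._index(r, key)] for r in range(self.rows))


class PerLevelDetector:
    """One independent heavy hitter detector per prefix length (assumed levels)."""

    def __init__(self, threshold, prefix_lengths=(8, 16, 24, 32), rows=4, cols=1024):
        self.threshold = threshold
        self.prefix_lengths = prefix_lengths
        self.sketches = {p: CountMinSketch(rows, cols) for p in prefix_lengths}
        self.heavy = {p: set() for p in prefix_lengths}

    def update(self, addr, count=1):
        ip = int(IPv4Address(addr))
        # No level reads another level's state, so in hardware each
        # iteration of this loop could run as its own parallel pipeline.
        for p in self.prefix_lengths:
            prefix = ip >> (32 - p)
            sketch = self.sketches[p]
            sketch.update(prefix, count)
            if sketch.estimate(prefix) >= self.threshold:
                self.heavy[p].add(prefix)

    def heavy_hitters(self):
        # Render each stored prefix as dotted-quad/length notation.
        return {
            p: {f"{IPv4Address(x << (32 - p))}/{p}" for x in found}
            for p, found in self.heavy.items()
        }


if __name__ == "__main__":
    det = PerLevelDetector(threshold=40)
    stream = [("10.1.2.3", 50), ("10.1.2.4", 30), ("10.1.9.9", 30), ("192.168.0.1", 5)]
    for addr, n in stream:
        det.update(addr, n)
    for level, found in det.heavy_hitters().items():
        print(level, sorted(found))
```

Because a Count-Min estimate never falls below the true count, this sketch cannot miss a prefix whose true count reaches the threshold, but hash collisions may cause it to report some prefixes that do not. In the demo stream, 10.1.2.0/24 is reported although only one of its addresses is heavy on its own, which shows why each level needs its own detector.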
