A Memory-Balanced Linear Pipeline Architecture for Trie-based IP Lookup

Rapid growth in network link rates places heavy demands on high-speed IP lookup engines. Trie-based architectures are natural candidates for pipelined implementation because pipelining can provide high throughput. However, simply mapping each trie level onto its own pipeline stage results in an unbalanced memory distribution across the stages. Several novel pipelined architectures have been proposed to address this problem, but their non-linear pipeline structures introduce new performance issues such as throughput degradation and delay variation. In this paper, we propose a simple and effective linear pipeline architecture for trie-based IP lookup. Our architecture achieves an evenly distributed memory across stages while sustaining a high throughput of one lookup per clock cycle. By supporting NOPs (no-operations), it offers more freedom in mapping trie nodes to pipeline stages. We implement our design as well as the state-of-the-art solutions on a commodity FPGA and evaluate their performance. Post-place-and-route results show that our design can achieve a throughput of 80 Gbps, up to twice the throughput of the reference solutions. It has constant delay, maintains input order, and supports incremental route updates without disrupting ongoing IP lookup operations.
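To make the mapping idea concrete, the following is a minimal Python sketch of how NOPs can decouple a trie node's level from its pipeline stage. It is an illustration under stated assumptions, not the paper's algorithm or its FPGA design: the uni-bit trie, the greedy placement heuristic, the toy route table, and all names (`Node`, `build_trie`, `level_sizes`, `map_balanced`, `lookup`) are hypothetical. The only constraint it enforces is that a child is stored in a strictly later stage than its parent; a lookup idles (a NOP) in every stage that does not hold its next node.

```python
import math

class Node:
    def __init__(self):
        self.child = [None, None]   # 0-branch and 1-branch
        self.next_hop = None        # set if a prefix ends at this node
        self.height = 0             # longest downward path, in edges
        self.stage = None           # pipeline stage assigned by map_balanced

def set_heights(node):
    kids = [c for c in node.child if c]
    for c in kids:
        set_heights(c)
    node.height = 1 + max(c.height for c in kids) if kids else 0

def build_trie(routes):
    root = Node()
    for bits, hop in routes:
        node = root
        for b in bits:
            i = int(b)
            if node.child[i] is None:
                node.child[i] = Node()
            node = node.child[i]
        node.next_hop = hop
    set_heights(root)
    return root

def level_sizes(root):
    """Nodes per stage when trie level d is mapped to stage d."""
    sizes, level = [], [root]
    while level:
        sizes.append(len(level))
        level = [c for n in level for c in n.child if c]
    return sizes

def map_balanced(root, num_stages):
    """Greedy heuristic (an assumption of this sketch): fill each stage up
    to an even share, tallest subtrees first, forcing nodes at their deadline."""
    if num_stages < root.height + 1:
        raise ValueError("need at least height + 1 stages")
    unplaced = sum(level_sizes(root))
    ready, sizes = [root], []
    for s in range(num_stages):
        cap = math.ceil(unplaced / (num_stages - s))
        ready.sort(key=lambda n: n.height, reverse=True)
        # A node of height h must sit no later than stage num_stages - 1 - h.
        forced = sum(1 for n in ready if n.height == num_stages - 1 - s)
        take = max(forced, min(cap, len(ready)))
        chosen, ready = ready[:take], ready[take:]
        for n in chosen:
            n.stage = s
            ready.extend(c for c in n.child if c)  # eligible from stage s + 1
        unplaced -= take
        sizes.append(take)
    return sizes

def lookup(root, addr_bits, num_stages):
    """Longest-prefix match; always walks all stages (constant delay)."""
    node, depth, best = root, 0, None
    for s in range(num_stages):
        if node is None or node.stage != s:
            continue                                # NOP in this stage
        if node.next_hop is not None:
            best = node.next_hop
        node = node.child[int(addr_bits[depth])] if depth < len(addr_bits) else None
        depth += 1
    return best

# Toy route table (hypothetical), given as bit-string prefixes.
routes = [("0", "A"), ("000", "B"), ("001", "C"), ("010", "D"), ("011", "E"),
          ("100", "F"), ("101", "G"), ("110", "H"), ("11100", "I")]
root = build_trie(routes)
level = level_sizes(root)          # per-stage node counts, level-per-stage
balanced = map_balanced(root, 6)   # per-stage node counts, with NOPs allowed
print(level, balanced, lookup(root, "01011111", 6))
```

For this toy table the level-per-stage mapping stores 1, 2, 4, 8, 1, 1 nodes in the six stages, while the greedy sketch stores 1, 2, 4, 4, 3, 3, halving the largest stage memory. Every lookup still traverses all six stages in order, which is why delay stays constant and input order is preserved in this model.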
