A new TCB cache to efficiently manage TCP sessions for web servers

TCP/IP, the most commonly used network protocol suite, consumes a significant portion of processing time in Internet servers. Although many techniques have been proposed to reduce this overhead, such as TCP offload engines (TOEs) and Direct Cache Access, most of them take a purely per-packet perspective and concentrate on the overhead of packet memory accesses. They ignore the per-session data, the TCP Control Block (TCB), which poses a challenge in web servers that handle a large volume of concurrent sessions. In this paper, we first characterize this challenge and show that TCB data must be managed efficiently. To address it, we propose a new TCB cache that is addressed by session identifiers. We design the TCB cache along two important axes: the cache indexing scheme and the cache replacement policy. First, we study the performance of various hash functions and propose a new indexing scheme for the TCB cache that employs two universal hash functions. We analyze session identifiers and select the most important bits as indexing bits to reduce the complexity of the hashing hardware. Second, by leveraging characteristics of web sessions, we design a speculative cache replacement policy that works effectively on a TCB cache with two cache banks. Experimental results show that the new cache manages the per-session data efficiently. When it is used in TOEs or integrated into CPUs to manage the per-session data, TCP/IP processing time is significantly reduced, which in turn shortens web server response time.
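
To make the indexing idea concrete, the following is a minimal software sketch of a two-bank cache in which each bank is indexed by its own universal hash of the session identifier. It is an illustration under stated assumptions, not the design evaluated in the paper: the H3-style hash family (each index bit is the parity of the key ANDed with a random mask), the 96-bit key layout, the names `session_key`, `H3Hash`, and `TwoBankTCBCache`, and the least-recently-used eviction between the two candidate slots are all hypothetical choices. In particular, the LRU rule only stands in for the speculative replacement policy, which the abstract does not specify, and the sketch hashes all key bits rather than a selected subset.

```python
import random

KEY_BITS = 96  # assumed layout: src IP (32) | dst IP (32) | src port (16) | dst port (16)


def session_key(src_ip, dst_ip, src_port, dst_port):
    """Pack a TCP 4-tuple into one integer session identifier (assumed layout)."""
    return (src_ip << 64) | (dst_ip << 32) | (src_port << 16) | dst_port


class H3Hash:
    """H3-style universal hash: each index bit is the parity of (key AND mask)."""

    def __init__(self, index_bits, seed):
        rng = random.Random(seed)
        self.masks = [rng.getrandbits(KEY_BITS) for _ in range(index_bits)]

    def __call__(self, key):
        index = 0
        for mask in self.masks:
            parity = bin(key & mask).count("1") & 1
            index = (index << 1) | parity
        return index


class TwoBankTCBCache:
    """Two direct-mapped banks, each indexed by a different hash function."""

    def __init__(self, index_bits=8):
        self.hashes = (H3Hash(index_bits, seed=1), H3Hash(index_bits, seed=2))
        self.banks = [{}, {}]  # per bank: index -> (key, tcb, last_used)
        self.clock = 0

    def lookup(self, key):
        """Return the cached TCB for key, or None on a miss."""
        self.clock += 1
        for bank, h in zip(self.banks, self.hashes):
            idx = h(key)
            entry = bank.get(idx)
            if entry is not None and entry[0] == key:
                bank[idx] = (key, entry[1], self.clock)  # refresh recency
                return entry[1]
        return None

    def insert(self, key, tcb):
        """Store a TCB, evicting the older of the two candidate entries if needed."""
        self.clock += 1
        slots = [(bank, h(key)) for bank, h in zip(self.banks, self.hashes)]
        for bank, idx in slots:  # update in place if already cached
            entry = bank.get(idx)
            if entry is not None and entry[0] == key:
                bank[idx] = (key, tcb, self.clock)
                return
        for bank, idx in slots:  # otherwise prefer an empty slot
            if idx not in bank:
                bank[idx] = (key, tcb, self.clock)
                return
        # Placeholder policy (assumption): evict the less recently used candidate.
        bank, idx = min(slots, key=lambda s: s[0][s[1]][2])
        bank[idx] = (key, tcb, self.clock)
```

Because the two banks use independent hash functions, two sessions that collide in one bank are unlikely to collide in the other, which is the usual motivation for this kind of skewed two-bank organization.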
