Machine-learnt versus analytical models of TCP throughput

We first study the accuracy of two well-known analytical models of the average throughput of long-term TCP flows, the SQRT and PFTK models, and show that they are far from accurate in general. Our simulations, based on a large set of long-term TCP sessions, show that 70% of their predictions exceed the boundaries of TCP-Friendliness, which calls into question their use in the design of new TCP-Friendly transport protocols. We then investigate the reasons for this inaccuracy and show that it is largely due to the models' failure to distinguish between the two packet loss detection methods used by TCP: triple duplicate acknowledgements and timeout expirations. We then apply various machine learning techniques to infer new models of the average TCP throughput. We show that these models are more accurate than the SQRT and PFTK models even without this distinction, and that they improve further when the machine-learnt models are allowed to distinguish between the two loss detection methods. Although our models are not analytical formulas, they can be plugged into transport protocols to make them TCP-Friendly. Our results also suggest that analytical models of TCP throughput would benefit from incorporating the timeout loss rate.
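For reference, the two analytical models named above are usually written in the forms sketched below: the SQRT formula of Mathis et al. and the approximate PFTK formula of Padhye et al. The abstract does not restate them, so the function names, parameter names (`mss`, `rtt`, `p`, `rto`, `b`, `wmax`) and the choice of bytes per second as the unit are assumptions of this sketch, not the authors' code. Both formulas take a single loss rate `p` and do not separate losses detected by triple duplicate acknowledgements from those detected by timeouts.

```python
import math


def sqrt_throughput(mss, rtt, p, b=1):
    """SQRT model: average throughput in bytes/s.

    mss: segment size in bytes, rtt: round-trip time in seconds,
    p: loss event rate (0 < p <= 1), b: packets acknowledged per ACK.
    """
    return (mss / rtt) * math.sqrt(3.0 / (2.0 * b * p))


def pftk_throughput(mss, rtt, p, rto, b=1, wmax=None):
    """Approximate PFTK model: average throughput in bytes/s.

    rto: retransmission timeout in seconds,
    wmax: optional maximum window in packets (None means no cap).
    """
    # Term for losses recovered by fast retransmit.
    fast_retransmit_term = rtt * math.sqrt(2.0 * b * p / 3.0)
    # Term for the time spent in timeouts.
    timeout_term = (
        rto
        * min(1.0, 3.0 * math.sqrt(3.0 * b * p / 8.0))
        * p
        * (1.0 + 32.0 * p ** 2)
    )
    rate = mss / (fast_retransmit_term + timeout_term)
    if wmax is not None:
        # The receiver window limits the rate to wmax packets per RTT.
        rate = min(rate, wmax * mss / rtt)
    return rate


if __name__ == "__main__":
    # Illustrative values only: 1460-byte segments, 100 ms RTT, 1% loss.
    print(sqrt_throughput(1460, 0.1, 0.01))
    print(pftk_throughput(1460, 0.1, 0.01, rto=0.4))
```

Because the PFTK denominator adds a non-negative timeout term to the SQRT denominator, its prediction is never higher than the SQRT prediction for the same inputs.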
