Detecting Triangle Inequality Violations in Internet Coordinate Systems by Supervised Learning

Internet Coordinate Systems (ICS) are used to predict Internet distances from a limited number of measurements. However, the precision of an ICS is degraded by the presence of Triangle Inequality Violations (TIVs). Simple methods have been proposed to detect TIVs, based for example on the empirical observation that a TIV is more likely when the coordinates underestimate the distance. In this paper, we apply supervised machine learning techniques to derive more powerful criteria for detecting TIVs. We first show that Decision Trees (DTs), and ensembles of DTs, learnt on our datasets are very good models for this problem. Moreover, our approach brings out a discriminative variable, called OREE, which combines the classical estimation error with the variance of the estimated distance. This variable alone performs as well as an ensemble of DTs and provides a much simpler criterion. If every node of the ICS sorts its neighbours according to OREE, we show that cutting these lists after a given number of neighbours, or when OREE crosses a given threshold value, detects TIVs with very good performance.
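As an illustration only, the following minimal Python sketch shows the two ideas mentioned above: labelling a pair of nodes as a TIV when a two-hop detour is shorter than the measured direct distance, and cutting a score-sorted neighbour list by count or by threshold. The function names, the toy delay matrix, and the assumption that a higher score means "more likely a TIV" are hypothetical; the abstract does not give the OREE formula, so the score is treated here as an opaque number per neighbour.

```python
from itertools import combinations


def tiv_edges(rtt):
    """Return the pairs (i, j), i < j, whose measured distance exceeds
    the length of at least one two-hop detour i -> k -> j."""
    n = len(rtt)
    violating = set()
    for i, j in combinations(range(n), 2):
        for k in range(n):
            if k != i and k != j and rtt[i][k] + rtt[k][j] < rtt[i][j]:
                violating.add((i, j))
                break  # one shorter detour is enough to flag the pair
    return violating


def cut_sorted_neighbours(scores, max_count=None, threshold=None):
    """Sort neighbours by decreasing score, then keep those at or above
    `threshold` and at most the first `max_count` (both optional).

    `scores` maps a neighbour identifier to a score; in the paper's
    setting this would be OREE, whose definition is not reproduced here.
    Treating larger scores as more suspicious is an assumption.
    """
    ranked = sorted(scores, key=scores.get, reverse=True)
    if threshold is not None:
        ranked = [n for n in ranked if scores[n] >= threshold]
    if max_count is not None:
        ranked = ranked[:max_count]
    return ranked


# Toy symmetric delay matrix (hypothetical values, in milliseconds).
rtt = [
    [0, 10, 50],
    [10, 0, 20],
    [50, 20, 0],
]

# Hypothetical per-neighbour scores held by one node.
scores = {"a": 0.9, "b": 0.1, "c": 0.5}

print(tiv_edges(rtt))
print(cut_sorted_neighbours(scores, max_count=2))
print(cut_sorted_neighbours(scores, threshold=0.5))
```

In the toy matrix, the direct distance between nodes 0 and 2 (50) is longer than the detour through node 1 (10 + 20), so that pair is the only one flagged. The exhaustive check is cubic in the number of nodes and needs the full delay matrix, which is why a per-neighbour score that each node can compute locally is attractive.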
