Reinforcement Learning-Based Load Shared Sequential Routing

We consider event-dependent routing algorithms for online explicit source routing in MPLS networks. The proposed methods are based on load-shared sequential routing, in which the load-sharing factors are updated by learning algorithms. These algorithms are based either on learning automata or on online learning algorithms originally devised for the adversarial multi-armed bandit problem. While simple to implement, the proposed learning algorithms compare favorably, in terms of blocking probability, with other event-dependent routing methods proposed for MPLS routing, such as the Success-to-the-top algorithm. Using a set of discrete-event simulations, we demonstrate the convergence of one of the learning algorithms to the user equilibrium.
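
To make the idea of learned load sharing factors concrete, the following is a minimal Python sketch of the two families of update rules named above. It is an illustration under stated assumptions, not the formulation used in the paper: the abstract does not give the update equations, so the sketch uses the textbook linear reward-inaction automaton and the standard Exp3 weight update for the adversarial bandit problem. The function names, the step size `step`, the exploration rate `gamma`, the binary reward (1 for an accepted setup request, 0 for a blocked one), and the per-route acceptance probabilities in `simulate` are all hypothetical.

```python
import math
import random


def lri_update(probs, chosen, success, step=0.05):
    """Linear reward-inaction update of the load sharing factors.

    Assumption: a successful setup on the chosen route is a reward;
    a blocked request leaves the factors unchanged (the "inaction" part).
    """
    if not success:
        return list(probs)
    return [
        p + step * (1.0 - p) if i == chosen else p * (1.0 - step)
        for i, p in enumerate(probs)
    ]


def exp3_probs(weights, gamma):
    """Exp3 selection probabilities: weights mixed with uniform exploration."""
    total = sum(weights)
    k = len(weights)
    return [(1.0 - gamma) * w / total + gamma / k for w in weights]


def exp3_update(weights, chosen, reward, gamma):
    """Exp3 weight update using an importance-weighted reward estimate."""
    probs = exp3_probs(weights, gamma)
    k = len(weights)
    estimate = reward / probs[chosen]  # unbiased estimate of the reward
    new_weights = list(weights)
    new_weights[chosen] *= math.exp(gamma * estimate / k)
    return new_weights


def simulate(accept_prob, n_calls=5000, step=0.05, seed=0):
    """Toy loop: each request picks a route according to the current factors.

    accept_prob[i] is a hypothetical, fixed probability that route i
    accepts a request; a real network would make this load dependent.
    """
    rng = random.Random(seed)
    probs = [1.0 / len(accept_prob)] * len(accept_prob)
    for _ in range(n_calls):
        chosen = rng.choices(range(len(probs)), weights=probs)[0]
        success = rng.random() < accept_prob[chosen]
        probs = lri_update(probs, chosen, success, step)
    return probs
```

In this sketch the load sharing factors are simply the selection probabilities over candidate routes, and each call-level event (accept or block) triggers one update, which is what makes the scheme event dependent. The sequential overflow to alternative paths after a blocked attempt is deliberately left out to keep the example short.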
