Competitive ratio versus regret minimization: achieving the best of both worlds

We consider online algorithms under both the competitive ratio criterion and the regret minimization criterion. Our main goal is to build a unified methodology that guarantees both criteria simultaneously. For a general class of online problems, namely any Metrical Task System (MTS), we show that one can simultaneously guarantee the best known competitive ratio and a natural regret bound. For the paging problem we further give an efficient online algorithm (polynomial in the number of pages) with this guarantee. To this end, we extend an existing regret minimization algorithm (specifically, that of Kapralov and Panigrahy) to handle movement cost (the cost of switching between states of the online system). We then show how to use the extended regret minimization algorithm to combine multiple online algorithms. Our end result is an online algorithm that combines a "base" online algorithm, which has a guaranteed competitive ratio, with a range of online algorithms that guarantee a small regret over any interval of time. The combined algorithm guarantees both a competitive ratio matching that of the base algorithm and low regret over any time interval. As a by-product, we obtain an expert algorithm with a close-to-optimal regret bound on every time interval, even in the presence of switching costs. This result is of independent interest.
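To make the notion of combining algorithms under movement cost concrete, the following is a minimal sketch that uses plain Hedge (multiplicative weights), not the extended Kapralov and Panigrahy algorithm described above. It treats each online algorithm as an expert and reports the expected service cost alongside a movement cost. The function name `hedge_with_movement`, the learning rate `eta`, and the choice of total variation distance as the movement cost are all illustrative assumptions, not details taken from the paper.

```python
import math


def hedge_with_movement(losses, eta):
    """Run plain Hedge over per-round expert losses (illustrative sketch).

    losses: list of rounds, each a list of per-expert losses in [0, 1].
    eta: learning rate (hypothetical parameter name).
    Returns (service_cost, movement_cost, best_expert_cost).
    """
    n = len(losses[0])
    weights = [1.0] * n
    prev = [1.0 / n] * n  # start from the uniform distribution
    service = 0.0
    movement = 0.0
    for round_losses in losses:
        total = sum(weights)
        probs = [w / total for w in weights]
        # Movement cost (assumed model): total variation distance to the
        # previous distribution, i.e. the minimum probability of switching.
        movement += 0.5 * sum(abs(p - q) for p, q in zip(probs, prev))
        # Expected service cost of following a randomly drawn expert.
        service += sum(p * l for p, l in zip(probs, round_losses))
        # Multiplicative update: experts with higher loss lose weight.
        weights = [w * math.exp(-eta * l) for w, l in zip(weights, round_losses)]
        prev = probs
    best = min(sum(r[i] for r in losses) for i in range(n))
    return service, movement, best


# Two experts over 100 rounds: expert 0 never pays, expert 1 always pays 1.
losses = [[0.0, 1.0] for _ in range(100)]
service, movement, best = hedge_with_movement(losses, eta=0.5)
print(service - best, movement)
```

In this sketch a smaller `eta` changes the distribution more slowly, which lowers the movement cost but slows convergence to the best expert. It measures regret only against the single best expert over the whole horizon, whereas the guarantee discussed above holds on every time interval.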
