Combining Initial Segments of Lists

We propose a new way to build a combined list from K base lists, each containing N items. A combined list consists of top segments of various sizes from each base list so that the total size of all top segments equals N. A sequence of item requests is processed, and the goal is to minimize the total number of misses, where a miss is a request for an item that is not in the combined list. That is, we seek to build a combined list that contains all the frequently requested items. We first consider the special case of disjoint base lists. There, we design an efficient algorithm that computes the best combined list for a given sequence of requests. In addition, we develop a randomized online algorithm whose expected number of misses is close to that of the best combined list chosen in hindsight. We prove lower bounds showing that the expected number of misses of our randomized algorithm is close to the optimum. In the presence of duplicate items, we show that computing the best combined list is NP-hard. We show that our algorithms still apply to a linearized notion of loss in this case. We expect that this new way of aggregating lists will find many ranking applications.
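
To make the offline problem for disjoint base lists concrete, here is a minimal sketch in Python. It is not the algorithm from the paper, whose details are not given in the abstract. It is a straightforward dynamic program over segment sizes, running in roughly O(K N^2) time, and it assumes the base lists are disjoint and all have the same length N. The function name `best_combined_list` and its arguments are hypothetical, chosen only for illustration.

```python
from collections import Counter


def best_combined_list(base_lists, requests):
    """Pick top-segment sizes (one per base list, summing to N) that
    minimize misses on `requests`. Assumes disjoint base lists of equal
    length N. Illustrative sketch only, not the paper's method."""
    n = len(base_lists[0])
    counts = Counter(requests)

    # prefix_hits[k][j] = number of requests served by the top j items of list k
    prefix_hits = []
    for lst in base_lists:
        hits = [0]
        for item in lst:
            hits.append(hits[-1] + counts[item])
        prefix_hits.append(hits)

    neg_inf = float("-inf")
    # best[s] = max hits using the lists seen so far with total segment size s
    best = [0] + [neg_inf] * n
    choices = []
    for hits in prefix_hits:
        new_best = [neg_inf] * (n + 1)
        choice = [0] * (n + 1)
        for s in range(n + 1):
            for j in range(s + 1):  # j = segment size taken from this list
                if best[s - j] == neg_inf:
                    continue
                cand = best[s - j] + hits[j]
                if cand > new_best[s]:
                    new_best[s] = cand
                    choice[s] = j
        best = new_best
        choices.append(choice)

    # Backtrack from total size N to recover the segment size of each list.
    sizes = []
    s = n
    for choice in reversed(choices):
        sizes.append(choice[s])
        s -= choice[s]
    sizes.reverse()

    misses = len(requests) - best[n]
    return sizes, misses
```

Called with two base lists such as `[["a", "b", "c"], ["d", "e", "f"]]` and a request sequence, the function returns one segment size per base list together with the resulting number of misses. Requests for items that appear in no base list are counted as misses under any choice of segments.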
