Consider the following setting for an online algorithm (introduced in [FS97]) that learns from a set of experts: In trial $t$ the algorithm chooses expert $i$ with probability $p_i^t$. At the end of the trial a loss vector $L^t \in [0,R]^n$ for the $n$ experts is received and an expected loss of $\sum_i p_i^t L_i^t$ is incurred. A simple algorithm for this setting is the Hedge algorithm, which uses the probabilities $p_i^t \propto e^{-\eta L_i^{<t}}$, where $L_i^{<t}$ denotes the cumulative loss of expert $i$ before trial $t$ and $\eta > 0$ is a learning rate. This algorithm and its analysis are a simple reformulation of the randomized version of the Weighted Majority algorithm (WMR) [LW94], which was designed for the absolute loss. The total expected loss of the algorithm is close to the total loss of the best expert, $L_* = \min_i L_i^{\leq T}$. That is, when the learning rate is optimally tuned based on $L_*$, $R$, and $n$, the total expected loss of the Hedge/WMR algorithm is at most
$$L_* + \sqrt{2\,L_* R \log n} + O(\log n).$$
The factor of $\sqrt{2}$ is in some sense optimal [Vov97].
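The Hedge update described above can be sketched as follows; this is a minimal illustration of the exponential-weights rule $p_i^t \propto e^{-\eta L_i^{<t}}$, not the authors' implementation, and the function and variable names are our own.

```python
import math

def hedge_probabilities(cum_losses, eta):
    """Return the Hedge distribution p_i proportional to exp(-eta * L_i^{<t})."""
    # Subtract the minimum cumulative loss for numerical stability;
    # this shifts all exponents but leaves the normalized weights unchanged.
    m = min(cum_losses)
    w = [math.exp(-eta * (L - m)) for L in cum_losses]
    s = sum(w)
    return [x / s for x in w]

def run_hedge(loss_sequence, eta):
    """Play Hedge over a sequence of trials.

    loss_sequence: list of loss vectors, one vector of n entries in [0, R] per trial.
    Returns the total expected loss and the final cumulative losses of the experts.
    """
    n = len(loss_sequence[0])
    cum = [0.0] * n          # L_i^{<t}: cumulative loss of each expert so far
    total = 0.0
    for losses in loss_sequence:
        p = hedge_probabilities(cum, eta)
        total += sum(p_i * l_i for p_i, l_i in zip(p, losses))  # expected loss this trial
        cum = [c + l for c, l in zip(cum, losses)]
    return total, cum
```

For instance, with two experts where the first incurs zero loss in every trial, the algorithm's probability mass shifts to the first expert at a rate governed by $\eta$, and the total expected loss stays close to $L_* = 0$, consistent with the bound above.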
[1] Yoav Freund, et al. A decision-theoretic generalization of on-line learning and an application to boosting, 1995, EuroCOLT.
[2] Manfred K. Warmuth, et al. The Weighted Majority Algorithm, 1994, Inf. Comput.
[3] Manfred K. Warmuth, et al. Path Kernels and Multiplicative Updates, 2002, J. Mach. Learn. Res.
[4] Vladimir Vovk, et al. A game of prediction with expert advice, 1995, COLT '95.
[5] Yoav Freund, et al. A decision-theoretic generalization of on-line learning and an application to boosting, 1997, J. Comput. Syst. Sci.
[6] Santosh S. Vempala, et al. Efficient algorithms for online decision problems, 2005, J. Comput. Syst. Sci.
[7] Leslie G. Valiant, et al. The Complexity of Enumeration and Reliability Problems, 1979, SIAM J. Comput.