On-line Variance Minimization in $O(n^2)$ per Trial?

Consider the following canonical online learning problem with matrices [WK06]: in each trial $t$ the algorithm chooses a density matrix $W_t \in \mathbb{R}^{n \times n}$ (i.e., a symmetric positive semi-definite matrix with trace one). Then nature chooses a symmetric loss matrix $L_t \in \mathbb{R}^{n \times n}$ whose eigenvalues lie in the interval $[0,1]$, and the algorithm incurs loss $\operatorname{tr}(W_t L_t)$. The goal is to find algorithms that, for any sequence of trials, have small regret against the best dyad chosen in hindsight. Here a dyad is an outer product $uu^\top$ of a unit vector $u \in \mathbb{R}^n$. More precisely, the regret after $T$ trials is defined as follows:
$$\sum_{t=1}^{T} \operatorname{tr}(W_t L_t) \;-\; \min_{u \in \mathbb{R}^n :\, \|u\|_2 = 1} \sum_{t=1}^{T} u^\top L_t u.$$
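As a concrete numerical illustration of this protocol, here is a minimal sketch in Python/NumPy. It relies only on the definitions above plus the Rayleigh-quotient fact that $\min_{\|u\|_2=1} u^\top M u$ equals the smallest eigenvalue of a symmetric matrix $M$, so the loss of the best dyad in hindsight is $\lambda_{\min}\bigl(\sum_t L_t\bigr)$. The uniform strategy $W_t = I/n$ and the helper name `regret` are hypothetical placeholders for illustration, not the algorithm the open problem asks for.

```python
import numpy as np

def regret(loss_matrices, density_matrices):
    """Regret of a sequence of density matrices W_t against the
    best dyad uu^T chosen in hindsight (definition above)."""
    # Cumulative loss of the algorithm: sum_t tr(W_t L_t).
    alg_loss = sum(np.trace(W @ L)
                   for W, L in zip(density_matrices, loss_matrices))
    # The best dyad's loss is min_{||u||=1} u^T (sum_t L_t) u,
    # i.e., the smallest eigenvalue of the summed loss matrix.
    best_dyad_loss = np.linalg.eigvalsh(sum(loss_matrices))[0]
    return alg_loss - best_dyad_loss

# Toy run: random symmetric losses with eigenvalues in [0, 1],
# played against the (placeholder) uniform strategy W_t = I/n.
rng = np.random.default_rng(0)
n, T = 4, 50
losses = []
for _ in range(T):
    Q, _ = np.linalg.qr(rng.standard_normal((n, n)))   # random orthogonal Q
    losses.append(Q @ np.diag(rng.uniform(0, 1, n)) @ Q.T)
print(regret(losses, [np.eye(n) / n] * T))
```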