Non-parametric Online AUC Maximization

We consider the problems of online and one-pass maximization of the area under the ROC curve (AUC). AUC maximization is hard even in the offline setting, so solutions often make compromises. Existing results for the online problem typically optimize a proxy defined via surrogate losses instead of maximizing the actual AUC. This approach is justified by results showing that the optimum of such a proxy, taken over the set of all (measurable) functions, maximizes the AUC. The problem is that, in order to meet the strict requirements on per-round running time, online methods typically work with restricted hypothesis classes. As we show, this restriction breaks the above compatibility and causes the methods to converge to suboptimal solutions even in some simple stochastic cases. To remedy this, we propose a different approach and show that it leads to asymptotic optimality. Our theoretical claims and considerations are tested by experiments on real datasets, which provide empirical justification for them.
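
To make the distinction between the actual AUC and a surrogate proxy concrete, the following minimal sketch computes both on a small labeled sample. It is an illustration only, not the method proposed here: the function names `empirical_auc` and `pairwise_hinge` are hypothetical, and the pairwise hinge loss is assumed as one common example of a surrogate.

```python
def _split(scores, labels):
    """Separate scores of positive (label 1) and negative (label 0) examples."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    return pos, neg


def empirical_auc(scores, labels):
    """Fraction of (positive, negative) pairs ranked correctly; ties count 1/2."""
    pos, neg = _split(scores, labels)
    total = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                total += 1.0
            elif p == n:
                total += 0.5
    return total / (len(pos) * len(neg))


def pairwise_hinge(scores, labels):
    """Average hinge surrogate max(0, 1 - (s_pos - s_neg)) over all pairs.

    Assumed here as a typical convex proxy; other surrogates are possible.
    """
    pos, neg = _split(scores, labels)
    total = sum(max(0.0, 1.0 - (p - n)) for p in pos for n in neg)
    return total / (len(pos) * len(neg))


# Toy sample: one positive (score 0.3) is ranked below one negative (score 0.8).
scores = [0.9, 0.8, 0.3, 0.1]
labels = [1, 0, 1, 0]
auc = empirical_auc(scores, labels)
hinge = pairwise_hinge(scores, labels)
```

Both quantities enumerate all positive-negative pairs, which costs time quadratic in the sample size. This is one reason the online setting cannot simply recompute them at every round.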
