Online Passive-Aggressive Algorithms

We present a unified view for online classification, regression, and uni-class problems. This view leads to a single algorithmic framework for the three problems. We prove worst-case loss bounds for various algorithms for both the realizable case and the non-realizable case. A conversion of our main online algorithm to the setting of batch learning is also discussed. The end result is new algorithms and accompanying loss bounds for the hinge-loss.
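The abstract does not spell out the update rule. As a rough sketch only, the block below assumes the binary classification setting with labels in {-1, +1} and the hinge loss, and uses the commonly cited closed-form step size (loss divided by the squared norm of the instance, optionally capped by an aggressiveness parameter). The function name `pa_update` and the parameter `C` are illustrative choices, not taken from this page.

```python
import numpy as np

def pa_update(w, x, y, C=None):
    """One passive-aggressive step for binary classification (y in {-1, +1}).

    Assumption: hinge loss with unit margin. If C is given, the step size
    is capped at C (a soft variant); otherwise the step is uncapped.
    """
    loss = max(0.0, 1.0 - y * float(np.dot(w, x)))  # hinge loss on this example
    if loss == 0.0:
        return w  # passive: margin already satisfied, keep the weights
    sq_norm = float(np.dot(x, x))
    if sq_norm == 0.0:
        return w  # a zero instance carries no direction to update along
    tau = loss / sq_norm  # aggressive: smallest step that zeroes the loss
    if C is not None:
        tau = min(C, tau)  # cap the step to limit the effect of noisy labels
    return w + tau * y * x

w = np.zeros(2)
x = np.array([1.0, 0.0])
w_next = pa_update(w, x, +1)  # uncapped step from the zero vector
```

After the uncapped step, the updated weights satisfy the unit margin on the example that triggered the update; with a finite `C` the step may stop short of that.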

Koby Crammer | Yoram Singer | Shai Shalev-Shwartz | Joseph Keshet | Ofer Dekel

[1] I. J. Schoenberg, et al. The Relaxation Method for Linear Inequalities, 1954, Canadian Journal of Mathematics.

[2] F. Rosenblatt, et al. The perceptron: a probabilistic model for information storage and organization in the brain, 1958, Psychological Review.

[3] Albert B. Novikoff, et al. On Convergence Proofs for Perceptrons, 1963.

[4] J. J. Rocchio, et al. Relevance feedback in information retrieval, 1971.

[5] Gerard Salton, et al. The SMART Retrieval System: Experiments in Automatic Document Processing, 1971.

[6] Gerard Salton, et al. Term-Weighting Approaches in Automatic Text Retrieval, 1988, Inf. Process. Manag.

[7] N. Littlestone. Mistake bounds and logarithmic linear-threshold learning algorithms, 1990.

[8] Hans Ulrich Simon, et al. From noise-free to noise-tolerant and from on-line to batch learning, 1995, COLT '95.

[9] Heinz H. Bauschke, et al. On Projection Algorithms for Solving Convex Feasibility Problems, 1996, SIAM Rev.

[10] Y. Censor, et al. Parallel Optimization: Theory, Algorithms, and Applications, 1997.

[11] Manfred K. Warmuth, et al. Exponentiated Gradient Versus Gradient Descent for Linear Predictors, 1997, Inf. Comput.

[12] Y. Censor, et al. Parallel Optimization: Theory, 1997.

[13] Claudio Gentile, et al. Linear Hinge Loss and Average Margin, 1998, NIPS.

[14] Vladimir Vapnik, et al. Statistical learning theory, 1998.

[15] Yoav Freund, et al. Large Margin Classification Using the Perceptron Algorithm, 1998, COLT.

[16] Yoram Singer, et al. Improved Boosting Algorithms Using Confidence-rated Predictions, 1998, COLT '98.

[17] Jason Weston, et al. Support vector machines for multi-class pattern recognition, 1999, ESANN.

[18] Manfred K. Warmuth, et al. Relative loss bounds for single neurons, 1999, IEEE Trans. Neural Networks.

[19] Nello Cristianini, et al. An Introduction to Support Vector Machines and Other Kernel-based Learning Methods, 2000.

[20] Yoram Singer, et al. Reducing Multiclass to Binary: A Unifying Approach for Margin Classifiers, 2000, J. Mach. Learn. Res.

[21] Claudio Gentile, et al. A New Approximate Maximal Margin Classification Algorithm, 2002, J. Mach. Learn. Res.

[22] Jason Weston, et al. A kernel method for multi-labelled classification, 2001, NIPS.

[23] Mark Herbster, et al. Learning Additive Models Online with Fast Evaluating Kernels, 2001, COLT/EuroCOLT.

[24] Koby Crammer, et al. On the Algorithmic Implementation of Multiclass Kernel-based Vector Machines, 2002, J. Mach. Learn. Res.

[25] Koby Crammer, et al. A new family of online algorithms for category ranking, 2002, SIGIR '02.

[26] Michael I. Jordan, et al. Distance Metric Learning with Application to Clustering with Side-Information, 2002, NIPS.

[27] Dustin Boswell, et al. Introduction to Support Vector Machines, 2002.

[28] Ben Taskar, et al. Max-Margin Markov Networks, 2003, NIPS.

[29] Koby Crammer, et al. Ultraconservative Online Algorithms for Multiclass Problems, 2001, J. Mach. Learn. Res.

[30] Thomas Hofmann, et al. Hidden Markov Support Vector Machines, 2003, ICML.

[31] Koby Crammer, et al. A Family of Additive Online Algorithms for Category Ranking, 2003, J. Mach. Learn. Res.

[32] Anthony Widjaja, et al. Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond, 2003, IEEE Transactions on Neural Networks.

[33] Martin Zinkevich, et al. Online Convex Programming and Generalized Infinitesimal Gradient Ascent, 2003, ICML.

[34] Yoram Singer, et al. Large margin hierarchical classification, 2004, ICML.

[35] Claudio Gentile, et al. The Robustness of the p-Norm Algorithms, 2003, Machine Learning.

[36] Alexander J. Smola, et al. Online learning with kernels, 2001, IEEE Transactions on Signal Processing.

[37] Manfred K. Warmuth, et al. Relative Loss Bounds for Multidimensional Regression Problems, 1997, Machine Learning.

[38] Yi Li, et al. The Relaxed Online Maximum Margin Algorithm, 1999, Machine Learning.

[39] Thomas Hofmann, et al. Support vector machine learning for interdependent and structured output spaces, 2004, ICML.

[40] Yoram Singer, et al. Online and batch learning of pseudo-metrics, 2004, ICML.

[41] Yoram Singer, et al. Learning to Align Polyphonic Music, 2004, ISMIR.

[42] Yoram Singer, et al. A Comparison of New and Old Algorithms for a Mixture Estimation Problem, 1995, COLT '95.

[43] Yoram Singer, et al. BoosTexter: A Boosting-based System for Text Categorization, 2000, Machine Learning.

[44] Yoram Singer, et al. The Power of Selective Memory: Self-Bounded Learning of Prediction Suffix Trees, 2004, NIPS.

[45] Michael Collins, et al. Discriminative Reranking for Natural Language Parsing, 2000, CL.