ShareBoost: Boosting for Multi-view Learning with Performance Guarantees

Algorithms that combine multi-view information can dramatically speed up learning and have been applied in many fields. However, they lack the ability to mine the most discriminative information sources (or data types) for making predictions. In this paper, we propose a boosting-based algorithm to address these problems. The proposed algorithm builds base classifiers independently from each data type (view), each of which provides a partial description of the object of interest. Unlike AdaBoost variants in which each view maintains its own re-sampling weights, our algorithm uses a single re-sampling distribution shared by all views at each boosting round. This distribution is determined by the view whose weighted training error is smallest. The shared sampling mechanism confines the influence of noise to individual views, thereby reducing sensitivity to noise. Furthermore, in order to establish performance guarantees, we introduce a randomized version of the algorithm in which the winning view is chosen probabilistically. As a result, the algorithm can be cast within a multi-armed bandit framework, which allows us to show that, with high probability, it seeks out the most discriminative views of the data for making predictions. We provide experimental results demonstrating its robustness to noise and its performance against competing techniques.
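The core mechanism described above can be illustrated with a minimal sketch, assuming decision stumps as base classifiers and an AdaBoost-style multiplicative weight update. This is not the authors' implementation; the function names (`best_stump`, `shareboost_round`) and the one-feature-per-view simplification are illustrative assumptions.

```python
import math

def best_stump(x, y, w):
    """Best 1-D threshold stump under example weights w.

    Returns (weighted error, threshold, polarity); polarity +1 predicts
    +1 when x >= threshold, polarity -1 flips the sign.
    """
    best = (float("inf"), 0.0, 1)
    for t in sorted(set(x)):
        for pol in (1, -1):
            err = sum(wi for xi, yi, wi in zip(x, y, w)
                      if (1 if pol * (xi - t) >= 0 else -1) != yi)
            if err < best[0]:
                best = (err, t, pol)
    return best

def shareboost_round(views, y, w):
    """One boosting round with a single shared sampling distribution.

    views: list of 1-D feature lists (one per view); y: labels in {-1, +1};
    w: current shared example weights (summing to 1).
    Returns (winning view index, stump, vote weight alpha, updated weights).
    """
    # Train one base classifier per view under the SAME weights w.
    stumps = [best_stump(x, y, w) for x in views]
    # The winning view is the one with minimal weighted training error.
    j = min(range(len(views)), key=lambda k: stumps[k][0])
    err, t, pol = stumps[j]
    err = max(err, 1e-10)  # guard against log(0) on a perfect stump
    alpha = 0.5 * math.log((1 - err) / err)  # AdaBoost-style vote weight
    # Update the single shared distribution using only the winning view.
    h = [1 if pol * (xi - t) >= 0 else -1 for xi in views[j]]
    w = [wi * math.exp(-alpha * yi * hi) for wi, yi, hi in zip(w, y, h)]
    z = sum(w)
    return j, (t, pol), alpha, [wi / z for wi in w]
```

Because every view is evaluated under the same distribution, a noisy view simply loses the per-round competition rather than corrupting the weights used by the other views; the randomized variant would replace the `min` selection with a probabilistic draw in the style of a bandit algorithm.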
