Early Active Learning via Robust Representation and Structured Sparsity

Labeling training data is time-consuming yet essential for supervised learning. Active learning addresses this problem by selecting informative and representative data points for labeling. During the early stage of an experiment, however, only a few (or no) labeled data points exist, so the most representative samples must be selected first. In this paper, we propose a novel robust active learning method that handles this early-stage experimental design problem by selecting the most representative data points. Because selecting the representative samples exactly is NP-hard, we employ a structured sparsity-inducing norm to relax the objective into an efficient convex formulation; in addition, a robust sparse representation loss function is used to reduce the effect of outliers. We introduce a new efficient optimization algorithm that solves the resulting non-smooth objective with low computational cost and proved global convergence. Empirical results on both single-label and multi-label classification benchmark data sets demonstrate the promise of our method.
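The abstract does not give the exact objective, so the following is only a minimal sketch of the general idea it describes: relaxing representative-sample selection into a convex, structured-sparsity problem. Here we assume a self-representation model, min over A of ||X - AX||_F^2 + λ Σ_j ||A[:, j]||_2, solved by iteratively reweighted least squares; samples whose columns of A have large ℓ2-norm are treated as representatives. The function name, the specific loss, and all parameters are illustrative assumptions, not the paper's formulation.

```python
import numpy as np

def select_representatives(X, n_select, lam=0.1, n_iter=30, eps=1e-8):
    """Pick n_select representative rows of X (n_samples x n_features).

    Convex surrogate (an assumption, not the paper's exact objective):
        min_A ||X - A X||_F^2 + lam * sum_j ||A[:, j]||_2
    The column-wise l2,1 penalty forces whole columns of A toward zero,
    so only a few samples are used to reconstruct the entire data set.
    """
    n = X.shape[0]
    G = X @ X.T                          # (n, n) Gram matrix of the samples
    d = np.ones(n)                       # IRLS reweighting diagonal
    A = np.zeros((n, n))
    for _ in range(n_iter):
        # Closed-form update: A (G + lam*D) = G  =>  A = G (G + lam*D)^{-1}
        A = G @ np.linalg.inv(G + lam * np.diag(d))
        col_norms = np.linalg.norm(A, axis=0)
        d = 1.0 / (2.0 * np.maximum(col_norms, eps))   # reweight columns
    scores = np.linalg.norm(A, axis=0)   # column energy = representativeness
    return np.argsort(-scores)[:n_select]
```

In this sketch, outlier samples tend to receive low column energy because few other samples are reconstructed from them, which loosely mirrors the robustness goal stated in the abstract.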
