Semi-supervised learning for tree-structured ensembles of RBF networks with Co-Training

Supervised learning requires large amounts of labeled data, but the labeling process can be expensive and time-consuming because it requires the effort of human experts. Co-Training is a semi-supervised learning method that reduces the amount of labeled data required by exploiting the available unlabeled data to improve classification accuracy. It assumes that the patterns are described by two or more feature sets (views), each sufficient for classification on its own, and that these views are independent given the class. At the same time, most real-world pattern recognition tasks involve a large number of categories, which makes the classification task harder. The tree-structured approach is an output space decomposition method in which a complex multi-class problem is decomposed into a set of binary sub-problems. In this paper, we propose two learning architectures that combine the merits of the tree-structured approach and Co-Training. We show that these architectures are especially useful for classification tasks involving many classes and little labeled data: the single-view tree-structured approach does not perform well on its own, but when combined with Co-Training it can effectively exploit the independent views and the unlabeled data to improve recognition accuracy.
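To make the Co-Training procedure concrete, the following is a minimal sketch of the standard two-view loop, not the paper's tree-structured RBF ensembles: each view trains its own base learner (here a toy nearest-centroid classifier on a single feature per view, purely for illustration), and in each round every learner labels its most confident unlabeled pattern and hands that pseudo-labeled example to the other view's training set.

```python
# Hypothetical illustration of two-view Co-Training with a toy
# nearest-centroid base learner; the base learners, data layout, and
# confidence measure are assumptions, not the method from the paper.

def fit_centroids(pairs):
    """pairs: list of (feature, label) -> {label: mean feature value}."""
    sums, counts = {}, {}
    for x, y in pairs:
        sums[y] = sums.get(y, 0.0) + x
        counts[y] = counts.get(y, 0) + 1
    return {y: sums[y] / counts[y] for y in sums}

def predict(cents, x):
    """Return (nearest-centroid label, confidence = negative distance)."""
    y = min(cents, key=lambda c: abs(cents[c] - x))
    return y, -abs(cents[y] - x)

def co_train(labeled, unlabeled, rounds=4):
    """labeled: list of ((v1, v2), label); unlabeled: list of (v1, v2)."""
    L1 = [(v1, y) for (v1, _), y in labeled]   # view-1 training set
    L2 = [(v2, y) for (_, v2), y in labeled]   # view-2 training set
    U = list(unlabeled)                        # shared unlabeled pool
    for _ in range(rounds):
        if not U:
            break
        c1, c2 = fit_centroids(L1), fit_centroids(L2)
        # View-1 learner pseudo-labels its most confident pattern
        # and teaches the view-2 learner with it.
        i = max(range(len(U)), key=lambda k: predict(c1, U[k][0])[1])
        y, _ = predict(c1, U[i][0])
        _, v2 = U.pop(i)
        L2.append((v2, y))
        if U:
            # Symmetrically, view 2 teaches view 1.
            i = max(range(len(U)), key=lambda k: predict(c2, U[k][1])[1])
            y, _ = predict(c2, U[i][1])
            v1, _ = U.pop(i)
            L1.append((v1, y))
    c1, c2 = fit_centroids(L1), fit_centroids(L2)

    def classify(pair):
        """Combine the two views: keep the more confident prediction."""
        y1, s1 = predict(c1, pair[0])
        y2, s2 = predict(c2, pair[1])
        return y1 if s1 >= s2 else y2
    return classify

# Tiny synthetic example: two well-separated classes in both views.
clf = co_train(
    labeled=[((0.0, 0.1), "A"), ((10.0, 9.9), "B")],
    unlabeled=[(0.2, 0.0), (9.8, 10.1), (0.1, 0.3), (10.2, 9.7)],
)
```

The proposed architectures replace these toy base learners with RBF networks arranged in a class-decomposition tree, but the exchange of confident pseudo-labels between views follows the same pattern.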
