A Mixed Ensemble Approach for the Semi-supervised Problem

In this paper we introduce a mixed approach to the semi-supervised learning problem. Our approach consists of an ensemble unsupervised learning stage in which the labeled and unlabeled points are segmented into clusters. We then exploit the a priori information carried by the labeled points to assign a class to each cluster, and use the ensemble method to predict the cluster membership of new incoming points. New data points are thus classified according to the segmentation of the whole set and the association of its clusters with the classes.
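The cluster-then-label procedure described above can be sketched as follows. This is a minimal illustration, not the paper's method: it assumes a single k-means clusterer in place of the ensemble clustering, with a deterministic farthest-point initialisation, and assigns each cluster the majority class of the labeled points that fall inside it; the function names (`cluster_then_label`, `predict`) are hypothetical.

```python
import numpy as np

def kmeans(X, k, iters=50):
    # Deterministic farthest-point initialisation (a k-means++-style seed),
    # chosen here so the sketch is reproducible without a random state.
    centers = [X[0]]
    for _ in range(1, k):
        dists = np.min(
            np.linalg.norm(X[:, None] - np.array(centers)[None], axis=2), axis=1
        )
        centers.append(X[dists.argmax()])
    centers = np.array(centers, dtype=float)
    for _ in range(iters):
        # Assign every point to its nearest center, then recompute centers.
        assign = np.linalg.norm(X[:, None] - centers[None], axis=2).argmin(axis=1)
        for j in range(k):
            if np.any(assign == j):
                centers[j] = X[assign == j].mean(axis=0)
    return centers, assign

def cluster_then_label(X_lab, y_lab, X_unlab, k):
    # Stage 1: cluster the labeled and unlabeled points together.
    X = np.vstack([X_lab, X_unlab])
    centers, assign = kmeans(X, k)
    # Stage 2: assign a class to each cluster by majority vote of the
    # labeled points it contains (-1 marks a cluster with no labeled point).
    lab_assign = assign[: len(X_lab)]
    cluster_class = {}
    for j in range(k):
        votes = y_lab[lab_assign == j]
        cluster_class[j] = int(np.bincount(votes).argmax()) if len(votes) else -1
    return centers, cluster_class

def predict(X_new, centers, cluster_class):
    # Classify new points via the class of their nearest cluster.
    nearest = np.linalg.norm(X_new[:, None] - centers[None], axis=2).argmin(axis=1)
    return np.array([cluster_class[j] for j in nearest])
```

For example, with two well-separated groups and one labeled point per group, the unlabeled points refine the cluster centers while the two labeled points suffice to name the clusters, so new points inherit the class of the group they fall into.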
