Learning from Crowds in Multi-dimensional Classification Domains

Learning from crowds is a recently popularized supervised classification framework in which the true labels of the training instances are not available. Instead, each instance is provided with a set of noisy class labels, each indicating the class membership of the instance according to the subjective opinion of one annotator. This paper explores the additional challenges involved in extending this framework to the multi-label domain. It presents a solution that combines a Structural EM strategy with multi-dimensional Bayesian network classifiers.
