Learning Bayesian network classifiers from label proportions

This paper addresses a classification problem known as learning from label proportions. The dataset consists of unlabeled instances divided into disjoint groups. Class information is available only at the group level: for each group, the proportion of its instances that belong to each class is known. We propose a method based on the Structural EM strategy that learns Bayesian network classifiers for this problem. Four versions of the proposal are evaluated on synthetic data and compared with state-of-the-art approaches on real datasets from public repositories. The results show that the proposed algorithm behaves competitively.
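To make the problem setting concrete, the following minimal Python sketch shows one way the data could be represented: each group keeps its unlabeled instances together with the class proportions for that group. It also shows a simple starting point that an EM-style procedure might use, where every instance inherits its group's proportions as a soft label. All names here (`label_proportions`, `to_llp_dataset`, `initial_soft_labels`) are hypothetical and chosen for illustration only; this is not the method proposed in the paper.

```python
from collections import Counter
from typing import Dict, List, Sequence, Tuple

Instance = Tuple[int, ...]
Group = Tuple[List[Instance], Dict[str, float]]


def label_proportions(labels: Sequence[str]) -> Dict[str, float]:
    """Return the fraction of labels that fall in each class."""
    counts = Counter(labels)
    total = len(labels)
    return {cls: count / total for cls, count in counts.items()}


def to_llp_dataset(
    labeled_groups: Sequence[Tuple[Sequence[Instance], Sequence[str]]]
) -> List[Group]:
    """Discard individual labels and keep only per-group class proportions."""
    return [
        (list(instances), label_proportions(labels))
        for instances, labels in labeled_groups
    ]


def initial_soft_labels(
    dataset: Sequence[Group], classes: Sequence[str]
) -> List[Dict[str, float]]:
    """Assumed initialization: each instance gets its group's proportions."""
    soft = []
    for instances, proportions in dataset:
        for _ in instances:
            soft.append({cls: proportions.get(cls, 0.0) for cls in classes})
    return soft


# Toy data (made up for illustration): two disjoint groups of binary instances.
labeled_groups = [
    ([(0, 1), (1, 1), (1, 0), (0, 0)], ["a", "a", "a", "b"]),
    ([(1, 1), (0, 1)], ["b", "b"]),
]
dataset = to_llp_dataset(labeled_groups)
soft_labels = initial_soft_labels(dataset, ["a", "b"])
```

In this toy example the first group retains only the information that three quarters of its instances belong to class `a`, while the second group happens to be fully determined because all of its instances belong to class `b`.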
