Label-Driven Learning Framework: Towards More Accurate Bayesian Network Classifiers through Discrimination of High-Confidence Labels

Bayesian network classifiers (BNCs) have demonstrated competitive classification accuracy in a variety of real-world applications. However, BNCs are prone to error when discriminating among high-confidence labels. To address this issue, we propose the label-driven learning framework, which incorporates instance-based learning and ensemble learning. For each testing instance, high-confidence labels are first selected by a generalist classifier, e.g., the tree-augmented naive Bayes (TAN) classifier. Conditional mutual information is then redefined over these labels to measure the mutual dependence between attributes more precisely, yielding a refined generalist with a more reasonable network structure. To enable finer discrimination, an expert classifier is tailored to each high-confidence label. Finally, the predictions of the refined generalist and the experts are aggregated. We extend TAN to LTAN (Label-driven TAN) by applying the proposed framework. Extensive experimental results demonstrate that LTAN achieves higher classification accuracy than not only several state-of-the-art single-structure BNCs but also some established ensemble BNCs, at the cost of reasonable computational overhead.
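To make the first two steps of the pipeline concrete, below is a minimal Python sketch. It assumes, as illustrative choices not taken from the paper, that the high-confidence labels are the smallest set covering 95% of the TAN posterior mass, and that the redefined conditional mutual information simply restricts the class variable's support to that set; `select_high_confidence` and `restricted_cmi` are hypothetical helper names.

```python
# A minimal sketch of the label-driven steps, not the paper's exact design.
# Assumptions: the 0.95 posterior-mass rule for selecting high-confidence
# labels, and restricting the class support in the conditional mutual
# information estimate to the selected labels.
import numpy as np

def select_high_confidence(posterior, mass=0.95):
    """Smallest label set whose generalist (TAN) posterior mass exceeds `mass`."""
    order = np.argsort(posterior)[::-1]  # labels, most probable first
    k = int(np.searchsorted(np.cumsum(posterior[order]), mass)) + 1
    return order[:k]

def restricted_cmi(X, y, i, j, labels):
    """Estimate I(X_i; X_j | C) with C restricted to the high-confidence
    `labels` (an assumed reading of the abstract's redefinition)."""
    mask = np.isin(y, labels)
    Xi, Xj, yc = X[mask, i], X[mask, j], y[mask]
    cmi = 0.0
    for c in labels:
        sel = yc == c
        if not sel.any():
            continue
        pc = sel.mean()  # p(c) within the restricted label set
        for a in np.unique(Xi[sel]):
            for b in np.unique(Xj[sel]):
                pab = np.mean((Xi == a) & (Xj == b) & sel)  # p(a, b, c)
                pa = np.mean((Xi == a) & sel)               # p(a, c)
                pb = np.mean((Xj == b) & sel)               # p(b, c)
                if pab > 0:
                    # p(a,b|c) / (p(a|c) p(b|c)) simplifies to pab*pc/(pa*pb)
                    cmi += pab * np.log(pab * pc / (pa * pb))
    return cmi
```

The refined generalist would re-run TAN's Chow-Liu structure-learning step with `restricted_cmi` in place of the usual conditional mutual information, and one expert classifier would then be trained per selected label before aggregating predictions; those steps depend on TAN internals and are omitted here.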
