Combining Classifiers and Learning Mixture-of-Experts

Expert combination is a classic strategy widely used in problem-solving tasks: a team of individuals with diverse and complementary skills tackles a task jointly, so that performance better than any single individual could achieve is obtained by integrating their strengths. Starting in the late 1980s, studies in the handwritten character recognition literature examined how to combine multiple classifiers. From the early 1990s, efforts in the neural networks and machine learning communities, under the names of ensemble learning and mixture-of-experts, addressed how to jointly learn a mixture of experts (parametric models) and a combining strategy that integrates them in an optimal sense. This article aims at a general sketch of these two streams of study, not only re-elaborating their essential tasks, basic ingredients, and typical combining rules, but also suggesting a general combination framework (in particular a concise and useful one-parameter modulated special case, called α-integration) that unifies a number of typical classifier combination rules and several mixture-based learning models, as well as the max rule and min rule used in the fuzzy systems literature.
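To make the one-parameter unification concrete, the following is a minimal sketch of α-integration in the sense of Amari's α-mean: combined scores are obtained by averaging classifier outputs through the function f_α(u) = u^((1-α)/2) (with the log function at α = 1). The function name and interface are illustrative, not from the article; the point is that a single parameter α recovers the arithmetic-mean (sum) rule at α = −1, the geometric-mean (product) rule at α = 1, and the max and min rules in the limits α → −∞ and α → +∞.

```python
import numpy as np

def alpha_mean(p, w=None, alpha=-1.0):
    """alpha-integration of positive scores p with weights w summing to 1.

    alpha = -1   -> arithmetic mean (sum rule)
    alpha =  1   -> geometric mean (product rule)
    alpha -> -inf -> max rule;  alpha -> +inf -> min rule
    """
    p = np.asarray(p, dtype=float)
    w = np.full(p.shape, 1.0 / p.size) if w is None else np.asarray(w, dtype=float)
    if np.isclose(alpha, 1.0):
        # limiting case: weighted geometric mean
        return float(np.exp(np.sum(w * np.log(p))))
    e = (1.0 - alpha) / 2.0            # exponent of the representation f_alpha
    return float(np.sum(w * p ** e) ** (1.0 / e))
```

For example, with two classifier scores 0.2 and 0.4 and equal weights, α = −1 gives 0.3 (sum rule) and α = 1 gives √0.08 ≈ 0.283 (product rule), while very large negative or positive α pushes the combined score toward 0.4 (max) or 0.2 (min) respectively.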
