Multiple Classifier Systems

While a variety of multiple classifier systems have been studied since at least the late 1950s, this area came alive in the 1990s with significant theoretical advances as well as numerous successful practical applications. This article argues that our current understanding of ensemble-type multiclassifier systems is now quite mature and exhorts the reader to consider a broader set of models and situations for further progress. Some of these scenarios have already been considered in the classical pattern recognition literature, but revisiting them often leads to new insights and progress. As an example, we consider how to integrate multiple clusterings, a problem central to several emerging distributed data mining applications. We also revisit output space decomposition to show how this can lead to the extraction of valuable domain knowledge in addition to improved classification accuracy.

1 A Brief History of Multilearner Systems

Multiple classifier systems are special cases of approaches that integrate several data-driven models for the same problem. A key goal is to obtain a better composite global model, with more accurate and reliable estimates or decisions. In addition, modular approaches often decompose a complex problem into subproblems for which the solutions obtained are simpler to understand, as well as to implement, manage and update.

Multilearner systems have a rather long and interesting history. For example, Borda counts for combining multiple rankings are named after their 18th-century French inventor, Jean-Charles de Borda. Early notable systems include Selfridge’s Pandemonium [1], a model of human information processing involving multiple demons. Each demon was specialized for detecting specific features or classes. A head-demon (the combiner) would select the demon that “shouted the loudest”, a scheme that is nowadays called a “winner-take-all” solution.
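
The two combiners mentioned above, Borda counts and winner-take-all selection, can be illustrated with a minimal Python sketch. This is an assumption-laden illustration and not code from any of the cited works: the function names `borda_count` and `winner_take_all`, the scoring convention (a candidate ranked at position p out of n receives n - 1 - p points), and the alphabetical tie-breaking are all choices made here for the example.

```python
from collections import defaultdict


def borda_count(rankings):
    """Combine several rankings into one consensus ranking.

    Each ranking is a list of candidates, best first. Assumed convention:
    with n candidates, the candidate at position p earns n - 1 - p points.
    """
    scores = defaultdict(int)
    for ranking in rankings:
        n = len(ranking)
        for position, candidate in enumerate(ranking):
            scores[candidate] += n - 1 - position
    # Highest total first; ties broken alphabetically (an arbitrary choice).
    return sorted(scores, key=lambda c: (-scores[c], c))


def winner_take_all(activations):
    """Return the label of the unit with the largest output.

    `activations` maps a label to the strength of that unit's response,
    loosely mirroring the head-demon picking the loudest demon.
    """
    return max(activations, key=activations.get)


if __name__ == "__main__":
    # Three hypothetical rankers ordering three hypothetical classes.
    rankings = [["a", "b", "c"], ["b", "a", "c"], ["b", "c", "a"]]
    print(borda_count(rankings))
    # Hypothetical demon outputs for a single input pattern.
    print(winner_take_all({"A": 0.2, "B": 0.9, "C": 0.4}))
```

The Borda combiner uses the full ordering supplied by every ranker, whereas winner-take-all discards everything except the single strongest response.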
Nilsson’s committee machine [2] combined several linear two-class models to solve a multiclass problem. A strong motivation for multilearner systems was voiced by Kanal in his classic 1974 paper [3]: “It is now recognized that the key to pattern recognition problems does not lie wholly in learning machines, statistical approaches, spatial, filtering,..., or in any other particular solution which has been vigorously

F. Roli and J. Kittler (Eds.): MCS 2002, LNCS 2364, pp. 1–15, 2002. © Springer-Verlag Berlin Heidelberg 2002

[1]  Y. Miyake,et al.  Facial pattern detection and color correction from television picture for newspaper printing , 1990 .

[2]  Kevin W. Bowyer,et al.  Combination of multiple classifiers using local accuracy estimates , 1996, Proceedings CVPR IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[3]  Hideyuki Tamura,et al.  Textural Features Corresponding to Visual Perception , 1978, IEEE Transactions on Systems, Man, and Cybernetics.

[4]  Ronald J. Patton,et al.  Interpretation of Trained Neural Networks by Rule Extraction , 2001, Fuzzy Days.

[5]  Horst Bunke,et al.  Using a Statistical Language Model to Improve the Performance of an HMM-Based Cursive Handwriting Recognition System , 2001, Int. J. Pattern Recognit. Artif. Intell..

[6]  John F. Canny,et al.  A Computational Approach to Edge Detection , 1986, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[7]  E Benfenati,et al.  Factors Influencing Predictive Models for Toxicology , 2001, SAR and QSAR in environmental research.

[8]  Arun Ross,et al.  Information fusion in biometrics , 2003, Pattern Recognit. Lett..

[9]  David Windridge,et al.  An Optimal Solution to the Problem of Multiple Expert Fusion , 2000 .

[10]  Thomas G. Dietterich,et al.  Error-Correcting Output Coding Corrects Bias and Variance , 1995, ICML.

[11]  Raimondo Schettini,et al.  Using a Relevance Feedback Mechanism to Improve Content-Based Image Retrieval , 1999, VISUAL.

[12]  Ron Kohavi,et al.  The Power of Decision Tables , 1995, ECML.

[13]  Mario Vento,et al.  Classifying audio of movies by a multi-expert system , 2001, Proceedings 11th International Conference on Image Analysis and Processing.

[14]  Roberto Brunelli,et al.  Person identification using multiple cues , 1995, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[15]  L. Sobin,et al.  World Health Organization classification of tumors , 2000, Cancer.

[16]  Vladimir Vapnik,et al.  Statistical learning theory , 1998 .

[17]  C. Frankel,et al.  Distinguishing photographs and graphics on the World Wide Web , 1997, 1997 Proceedings IEEE Workshop on Content-Based Access of Image and Video Libraries.

[18]  Fabio Roli,et al.  Dynamic Classifier Selection , 2000, Multiple Classifier Systems.

[19]  Lakhmi C. Jain,et al.  Designing classifier fusion systems by genetic algorithms , 2000, IEEE Trans. Evol. Comput..

[20]  Kagan Tumer,et al.  Linear and Order Statistics Combiners for Pattern Classification , 1999, ArXiv.

[21]  Yoav Freund,et al.  A decision-theoretic generalization of on-line learning and an application to boosting , 1995, EuroCOLT.

[22]  Daniel Boley,et al.  Clustering and classification techniques to assess aquatic toxicity , 2000, KES'2000. Fourth International Conference on Knowledge-Based Intelligent Engineering Systems and Allied Technologies. Proceedings (Cat. No.00TH8516).

[23]  David Windridge,et al.  Classifier Combination as a Tomographic Process , 2001, Multiple Classifier Systems.

[24]  Jaume Pujol,et al.  Progressive classification scheme for document layout recognition , 1999, Optics & Photonics.

[25]  Saso Dzeroski,et al.  Combining Multiple Models with Meta Decision Trees , 2000, PKDD.

[26]  Derek Partridge,et al.  Software Diversity: Practical Statistics for Its Measurement and Exploitation | Draft Currently under Revision , 1996 .

[27]  C. Helma,et al.  Statistical Methods in Medical Research Knowledge Discovery and Data Mining in Toxicology , 2022 .

[28]  Joydeep Ghosh,et al.  A Hierarchical Multiclassifier System for Hyperspectral Data Analysis , 2000, Multiple Classifier Systems.

[29]  Cesare Furlanello,et al.  Boosting of Tree-Based Classifiers for Predictive Risk Modeling in GIS , 2000, Multiple Classifier Systems.

[30]  Noel E. Sharkey,et al.  Combining diverse neural nets , 1997, The Knowledge Engineering Review.

[31]  Thomas G. Dietterich Multiple Classifier Systems , 2000, Lecture Notes in Computer Science.

[32]  Mario Vento,et al.  Dialogue Scenes Detection in MPEG Movies: A Multi-expert Approach , 2001, MDIC.

[33]  Ching Y. Suen,et al.  A method of combining multiple classifiers-a neural network approach , 1994, Proceedings of the 12th IAPR International Conference on Pattern Recognition, Vol. 3 - Conference C: Signal Processing (Cat. No.94CH3440-5).

[34]  Fuad Rahman,et al.  A new hybrid approach in combining multiple experts to recognise handwritten numerals , 1997, Pattern Recognit. Lett..

[35]  Josef Kittler,et al.  Improving the performance of the product fusion strategy , 2000, Proceedings 15th International Conference on Pattern Recognition. ICPR-2000.

[36]  D. Ruppert The Elements of Statistical Learning: Data Mining, Inference, and Prediction , 2004 .

[37]  Robert P. W. Duin,et al.  Expected classification error of the Fisher linear classifier with pseudo-inverse covariance matrix , 1998, Pattern Recognit. Lett..

[38]  Ching Y. Suen,et al.  Computer recognition of unconstrained handwritten numerals , 1992, Proc. IEEE.

[39]  Nathan Intrator,et al.  Boosted Mixture of Experts: An Ensemble Learning Scheme , 1999, Neural Computation.

[40]  Paul Scheunders,et al.  Wavelet-based Texture Analysis , 1998 .

[41]  Johannes R. Sveinsson,et al.  Boosting, Bagging, and Consensus Based Classification of Multisource Remote Sensing Data , 2001, Multiple Classifier Systems.

[42]  Raimondo Schettini,et al.  Content-based image classification , 1999, Electronic Imaging.

[43]  J. R. Quinlan Learning With Continuous Classes , 1992 .

[44]  Tin Kam Ho Data Complexity Analysis for Classifier Combination , 2001, Multiple Classifier Systems.

[45]  Harris Drucker,et al.  Boosting and Other Ensemble Methods , 1994, Neural Computation.

[46]  Robert P. W. Duin,et al.  A Generalized Kernel Approach to Dissimilarity-based Classification , 2002, J. Mach. Learn. Res..

[47]  Vasile Palade,et al.  Neural and Neuro-Fuzzy Integration in a Knowledge-Based System for Air Quality Prediction , 2002, Applied Intelligence.

[48]  Thomas G. Dietterich Machine-Learning Research Four Current Directions , 1997 .

[49]  Thomas G. Dietterich An Experimental Comparison of Three Methods for Constructing Ensembles of Decision Trees: Bagging, Boosting, and Randomization , 2000, Machine Learning.

[50]  Bev Littlewood,et al.  Conceptual Modeling of Coincident Failures in Multiversion Software , 1989, IEEE Trans. Software Eng..

[51]  Anil K. Jain,et al.  Image classification for content-based indexing , 2001, IEEE Trans. Image Process..

[52]  Horst Bunke,et al.  Hidden Markov model length optimization for handwriting recognition systems , 2002, Proceedings Eighth International Workshop on Frontiers in Handwriting Recognition.

[53]  Alberto Del Bimbo,et al.  Content-based indexing and retrieval of TV news , 2001, Pattern Recognit. Lett..

[54]  Robert P. W. Duin,et al.  Stabilizing classifiers for very small sample sizes , 1996, Proceedings of 13th International Conference on Pattern Recognition.

[55]  Kurt Hornik,et al.  Neural networks and principal component analysis: Learning from examples without local minima , 1989, Neural Networks.

[56]  Galina L. Rogova,et al.  Combining the results of several neural network classifiers , 1994, Neural Networks.

[57]  Yoram Singer,et al.  Boosting and Rocchio applied to text filtering , 1998, SIGIR '98.

[58]  Ching Y. Suen,et al.  A Method of Combining Multiple Experts for the Recognition of Unconstrained Handwritten Numerals , 1995, IEEE Trans. Pattern Anal. Mach. Intell..

[59]  Norman Poh,et al.  Hybrid Biometric Person Authentication Using Face and Voice Features , 2001, AVBPA.

[60]  Stanley Boykin,et al.  Machine learning of event segmentation for news on demand , 2000, CACM.

[61]  Yoram Singer,et al.  BoosTexter: A Boosting-based System for Text Categorization , 2000, Machine Learning.

[62]  David A. Landgrebe,et al.  Hyperspectral Image Data Analysis as a High Dimensional Signal Processing Problem , 2002 .

[63]  Pedro M. Domingos A Unified Bias-Variance Decomposition for Zero-One and Squared Loss , 2000, AAAI/IAAI.

[64]  Sargur N. Srihari,et al.  Decision Combination in Multiple Classifier Systems , 1994, IEEE Trans. Pattern Anal. Mach. Intell..

[65]  Ke Chen,et al.  A method of combining multiple probabilistic classifiers through soft competition on different feature sets , 1998, Neurocomputing.

[66]  Geoffrey E. Hinton,et al.  Adaptive Mixtures of Local Experts , 1991, Neural Computation.

[67]  Bernard Zenko,et al.  A comparison of stacking with meta decision trees to bagging, boosting, and stacking with other methods , 2001, Proceedings 2001 IEEE International Conference on Data Mining.

[68]  Johannes Fürnkranz,et al.  An Evaluation of Grading Classifiers , 2001, IDA.

[69]  Jiri Matas,et al.  On Combining Classifiers , 1998, IEEE Trans. Pattern Anal. Mach. Intell..

[70]  John A. Richards,et al.  Segmented principal components transformation for efficient hyperspectral remote-sensing image display and classification , 1999, IEEE Trans. Geosci. Remote. Sens..

[71]  Yoav Freund,et al.  Experiments with a New Boosting Algorithm , 1996, ICML.

[72]  Adam Krzyżak,et al.  Methods of combining multiple classifiers and their applications to handwriting recognition , 1992, IEEE Trans. Syst. Man Cybern..

[73]  Emilio Benfenati,et al.  COMET: the approach of a project in evaluating toxicity , 1999 .

[74]  Josef Kittler,et al.  Multiple Classifier Systems , 2004, Lecture Notes in Computer Science.

[75]  Gyeonghwan Kim,et al.  An architecture for handwritten text recognition systems , 1999, International Journal on Document Analysis and Recognition.

[76]  Jerome H. Friedman,et al.  On Bias, Variance, 0/1—Loss, and the Curse-of-Dimensionality , 2004, Data Mining and Knowledge Discovery.

[77]  Fabio Roli,et al.  Methods for dynamic classifier selection , 1999, Proceedings 10th International Conference on Image Analysis and Processing.

[78]  Trevor F. Cox,et al.  Metric multidimensional scaling , 2000 .

[79]  Leo Breiman,et al.  Bagging Predictors , 1996, Machine Learning.

[80]  Jiri Matas,et al.  On Matching Scores for LDA-based Face Verification , 2000, BMVC.

[81]  Bogdan Gabrys,et al.  Learning hybrid neuro-fuzzy classifier models from data: to combine or not to combine? , 2004, Fuzzy Sets Syst..

[82]  Yoav Freund,et al.  Boosting the margin: A new explanation for the effectiveness of voting methods , 1997, ICML.

[83]  Robert Tibshirani,et al.  Bias, Variance and Prediction Error for Classification Rules , 1996 .

[84]  Simon Haykin,et al.  Neural Networks: A Comprehensive Foundation , 1998 .

[85]  Naonori Ueda,et al.  Optimal Linear Combination of Neural Networks for Improving Classification Performance , 2000, IEEE Trans. Pattern Anal. Mach. Intell..

[86]  Anil K. Jain,et al.  Small Sample Size Effects in Statistical Pattern Recognition: Recommendations for Practitioners , 1991, IEEE Trans. Pattern Anal. Mach. Intell..

[87]  Robert P. W. Duin,et al.  Classifier Conditional Posterior Probabilities , 1998, SSPR/SPR.

[88]  Aiko M. Hormann,et al.  Programs for Machine Learning. Part I , 1962, Inf. Control..

[89]  James L. McClelland,et al.  Parallel distributed processing: explorations in the microstructure of cognition, vol. 1: foundations , 1986 .

[90]  Heekuck Oh,et al.  Neural Networks for Pattern Recognition , 1993, Adv. Comput..

[91]  David A. Landgrebe,et al.  Covariance estimation with limited training samples , 1999, IEEE Trans. Geosci. Remote. Sens..

[92]  Robert P. W. Duin,et al.  Spatial Representation of Dissimilarity Data via Lower-Complexity Linear and Nonlinear Mappings , 2002, SSPR/SPR.

[93]  Gian Luca Marcialis,et al.  Complexity of Data Subsets Generated by the Random Subspace Method: An Experimental Investigation , 2001, Multiple Classifier Systems.

[94]  A. R. Newman Electronic noses. , 1991, Analytical chemistry.

[95]  K. Sirlantzis,et al.  Investigation of a novel self-configurable multiple classifier system for character recognition , 2001, Proceedings of Sixth International Conference on Document Analysis and Recognition.

[96]  Bogdan Gabrys,et al.  Analysis of the Correlation Between Majority Voting Error and the Diversity Measures in Multiple Classifier Systems , 2001 .

[97]  David H. Wolpert,et al.  Stacked generalization , 1992, Neural Networks.

[98]  Amanda J. C. Sharkey,et al.  Combining Artificial Neural Nets: Ensemble and Modular Multi-Net Systems , 1999 .

[99]  J. van Leeuwen,et al.  Audio- and Video-Based Biometric Person Authentication , 2001, Lecture Notes in Computer Science.

[100]  Yoshua Bengio,et al.  Training Methods for Adaptive Boosting of Neural Networks , 1997, NIPS.

[101]  Horst Bunke,et al.  Lipreading: A classifier combination approach , 1997, Pattern Recognit. Lett..

[102]  Fabio Roli,et al.  Performance Analysis and Comparison of Linear Combiners for Classifier Fusion , 2002, SSPR/SPR.

[103]  John G. Cleary,et al.  K*: An Instance-based Learner Using an Entropic Distance Measure , 1995, ICML.

[104]  Fabio Roli,et al.  Multisensor Image Recognition by Neural Networks with Understandable Behavior , 1996, Int. J. Pattern Recognit. Artif. Intell..

[105]  Keinosuke Fukunaga,et al.  Introduction to Statistical Pattern Recognition , 1972 .

[106]  Cesare Furlanello,et al.  Tuning Cost-Sensitive Boosting and Its Application to Melanoma Diagnosis , 2001, Multiple Classifier Systems.

[107]  Alan Hanjalic,et al.  Automated high-level movie segmentation for advanced video-retrieval systems , 1999, IEEE Trans. Circuits Syst. Video Technol..

[108]  Robert A. Jacobs,et al.  Hierarchical Mixtures of Experts and the EM Algorithm , 1993, Neural Computation.

[109]  Robert King,et al.  Textural features corresponding to textural properties , 1989, IEEE Trans. Syst. Man Cybern..

[110]  Y.P. Kahya,et al.  Hierarchical classification of respiratory sounds , 1998, Proceedings of the 20th Annual International Conference of the IEEE Engineering in Medicine and Biology Society. Vol.20 Biomedical Engineering Towards the Year 2000 and Beyond (Cat. No.98CH36286).

[111]  Ron Kohavi,et al.  Bias Plus Variance Decomposition for Zero-One Loss Functions , 1996, ICML.

[112]  Yong Wang,et al.  Using Model Trees for Classification , 1998, Machine Learning.

[113]  Lawrence R. Rabiner,et al.  A tutorial on hidden Markov models and selected applications in speech recognition , 1989, Proc. IEEE.

[114]  Mark D. Bedworth,et al.  High level data fusion , 1999 .

[115]  Jakob Vogdrup Hansen,et al.  Combining Predictors: Comparison of Five Meta Machine Learning Methods , 1999, Inf. Sci..

[116]  Leo Breiman,et al.  Bias, Variance, and Arcing Classifiers , 1996 .

[117]  Robert E. Schapire,et al.  A Brief Introduction to Boosting , 1999, IJCAI.

[118]  Virginia R. de Sa,et al.  Learning Classification with Unlabeled Data , 1993, NIPS.

[119]  Christopher J. Merz,et al.  Using Correspondence Analysis to Combine Classifiers , 1999, Machine Learning.

[120]  Ching Y. Suen,et al.  Application of majority voting to pattern recognition: an analysis of its behavior and performance , 1997, IEEE Trans. Syst. Man Cybern. Part A.

[121]  Bogdan Gabrys,et al.  Combining neuro-fuzzy classifiers for improved generalisation and reliability , 2002, Proceedings of the 2002 International Joint Conference on Neural Networks. IJCNN'02 (Cat. No.02CH37290).

[122]  Josef Kittler,et al.  An Experimental Comparison of Classifier Fusion Rules for Multimodal Personal Identity Verification Systems , 2002, Multiple Classifier Systems.

[123]  Chin-Teng Lin,et al.  Neural-Network-Based Fuzzy Logic Control and Decision System , 1991, IEEE Trans. Computers.

[124]  P. Gallinari,et al.  Modular neural net systems, training of , 1998 .

[125]  Naoki Hara,et al.  Fuzzy rule extraction from a multilayered neural network , 1991, IJCNN-91-Seattle International Joint Conference on Neural Networks.

[126]  Fuad Rahman,et al.  Machine-printed character recognition revisited: re-application of recent advances in handwritten character recognition research , 1998, Image Vis. Comput..

[127]  Ian H. Witten,et al.  Induction of model trees for predicting continuous classes , 1996 .

[128]  Mohamed S. Kamel,et al.  Modular Neural Network Classifiers: A Comparative Study , 1998, J. Intell. Robotic Syst..

[129]  Noel E. Sharkey,et al.  A Multi-Net System for the Fault Diagnosis of a Diesel Engine , 2000, Neural Computing & Applications.

[130]  Lorenzo Bruzzone,et al.  Combination of neural and statistical algorithms for supervised classification of remote-sensing image , 2000, Pattern Recognit. Lett..

[131]  Thorsten Joachims,et al.  Making large scale SVM learning practical , 1998 .

[132]  Geoffrey E. Hinton,et al.  Learning internal representations by error propagation , 1986 .

[133]  Michael C. Fairhurst,et al.  Genetic Algorithms for Multi-classifier System Configuration: A Case Study in Character Recognition , 2001, Multiple Classifier Systems.

[134]  Ching Y. Suen,et al.  Optimal combinations of pattern classifiers , 1995, Pattern Recognition Letters.

[135]  Tom Michael Mitchell,et al.  The Role of Unlabeled Data in Supervised Learning , 2004 .

[136]  H. Gish,et al.  Text-independent speaker identification , 1994, IEEE Signal Processing Magazine.

[137]  Josef Kittler,et al.  A Framework for Classifier Fusion: Is It Still Needed? , 2000, SSPR/SPR.

[138]  E. Mayoraz,et al.  Fusion of face and speech data for person identity verification , 1999, IEEE Trans. Neural Networks.

[139]  Jiri Matas,et al.  Audio-visual person verification , 1999, Proceedings. 1999 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (Cat. No PR00149).

[140]  M. Skurichina,et al.  Stabilizing weak classifiers , 2001 .

[141]  Horst Bunke,et al.  A full English sentence database for off-line handwriting recognition , 1999, Proceedings of the Fifth International Conference on Document Analysis and Recognition. ICDAR '99 (Cat. No.PR00318).

[142]  Ahmad Fuad Rezaur Rahman,et al.  Automatic self-configuration of a novel multiple-expert classifier using a genetic algorithm , 1999 .

[143]  William B. Yates,et al.  Engineering Multiversion Neural-Net Systems , 1996, Neural Computation.

[144]  Kagan Tumer,et al.  Error Correlation and Error Reduction in Ensemble Classifiers , 1996, Connect. Sci..

[145]  Sanjeev R. Kulkarni,et al.  Rapid estimation of camera motion from compressed video with application to video annotation , 2000, IEEE Trans. Circuits Syst. Video Technol..

[146]  Robert P. W. Duin,et al.  Is independence good for combining classifiers? , 2000, Proceedings 15th International Conference on Pattern Recognition. ICPR-2000.

[147]  Pedro M. Domingos A Unified Bias-Variance Decomposition and its Applications , 2000, ICML.

[148]  Ke Chen,et al.  Methods of Combining Multiple Classifiers with Different Features and Their Applications to Text-Independent Speaker Identification , 1997, Int. J. Pattern Recognit. Artif. Intell..

[149]  Ludmila I. Kuncheva,et al.  Measures of Diversity in Classifier Ensembles and Their Relationship with the Ensemble Accuracy , 2003, Machine Learning.

[150]  Stefan Fischer,et al.  Person Authentication by Fusing Face and Speech Information , 1997, AVBPA.

[151]  Amanda J. C. Sharkey,et al.  On Combining Artificial Neural Nets , 1996, Connect. Sci..

[152]  Simon M. Lucas,et al.  Recognition of chain-coded handwritten character images with scanning n-tuple method , 1995 .

[153]  Ian H. Witten,et al.  Issues in Stacked Generalization , 2011, J. Artif. Intell. Res..

[154]  Roberto Battiti,et al.  Democracy in neural nets: Voting schemes for classification , 1994, Neural Networks.

[155]  James C. Bezdek,et al.  Decision templates for multiple classifier fusion: an experimental comparison , 2001, Pattern Recognit..

[156]  Herbert Freeman,et al.  Computer Processing of Line-Drawing Images , 1974, CSUR.

[157]  Horst Bunke,et al.  Combination of face classifiers for person identification , 1996, Proceedings of 13th International Conference on Pattern Recognition.

[158]  Soo-Chang Pei,et al.  Efficient MPEG Compressed Video Analysis Using Macroblock Type Information , 1999, IEEE Trans. Multim..

[159]  Arnold W. M. Smeulders,et al.  PicToSeek: combining color and shape invariant features for image retrieval , 2000, IEEE Trans. Image Process..

[160]  Anja Vogler,et al.  An Introduction to Multivariate Statistical Analysis , 2004 .

[161]  Ching Y. Suen,et al.  Multiple Classifier Combination Methodologies for Different Output Levels , 2000, Multiple Classifier Systems.

[162]  Markus A. Stricker,et al.  Similarity of color images , 1995, Electronic Imaging.

[163]  Martin Szummer,et al.  Indoor-outdoor image classification , 1998, Proceedings 1998 IEEE International Workshop on Content-Based Access of Image and Video Database.

[164]  Fuad Rahman,et al.  An Evaluation Of Multi-Expert Configurations For The Recognition Of Handwritten Numerals , 1998, Pattern Recognit..

[165]  Rui Zhang,et al.  Adaptive confidence transform based classifier combination for Chinese character recognition , 1998, Pattern Recognit. Lett..

[166]  Sarunas Raudys,et al.  Evolution and generalization of a single neurone: I. Single-layer perceptron as seven statistical classifiers , 1998, Neural Networks.

[167]  William G. Baxt,et al.  Improving the Accuracy of an Artificial Neural Network Using Multiple Differently Trained Networks , 1992, Neural Computation.

[168]  J.-C. Simon,et al.  Off-line cursive word recognition , 1992, Proc. IEEE.

[169]  Andrew R. Webb,et al.  Statistical Pattern Recognition , 1999 .

[170]  Ingemar Lundström,et al.  Data preprocessing enhances the classification of different brands of Espresso coffee with an electronic nose , 2000 .

[171]  Robert P. W. Duin,et al.  Experiments with Classifier Combining Rules , 2000, Multiple Classifier Systems.

[172]  Anastasios Tefas,et al.  Morphological elastic graph matching applied to frontal face authentication under well-controlled and real conditions , 2000, Pattern Recognit..

[173]  Pat Langley,et al.  Estimating Continuous Distributions in Bayesian Classifiers , 1995, UAI.

[174]  Michael Fairhurst,et al.  Moving window classifier: approach to off-line image recognition , 2000 .

[175]  Elie Bienenstock,et al.  Neural Networks and the Bias/Variance Dilemma , 1992, Neural Computation.

[176]  Anil K. Jain,et al.  Reject option for VQ-based Bayesian classification , 2000, Proceedings 15th International Conference on Pattern Recognition. ICPR-2000.

[177]  R. Jenssen,et al.  The HyMap™ Airborne Hyperspectral Sensor: The System, Calibration and Performance , 1998 .

[178]  Yann LeCun,et al.  Transforming Neural-Net Output Levels to Probability Distributions , 1990, NIPS.

[179]  Robert A. Jacobs,et al.  Methods For Combining Experts' Probability Assessments , 1995, Neural Computation.

[180]  D. Kibler,et al.  Instance-based learning algorithms , 2004, Machine Learning.

[181]  Fabio Roli,et al.  Analysis of Linear and Order Statistics Combiners for Fusion of Imbalanced Classifiers , 2002, Multiple Classifier Systems.

[182]  Giuseppina C. Gini,et al.  Mixing a Symbolic and a Subsymbolic Expert to Improve Carcinogenicity Prediction of Aromatic Compounds , 2001, Multiple Classifier Systems.

[183]  Thomas G. Dietterich,et al.  Solving Multiclass Learning Problems via Error-Correcting Output Codes , 1994, J. Artif. Intell. Res..

[184]  Josef Kittler,et al.  Combining multiple classifiers by averaging or by multiplying? , 2000, Pattern Recognit..

[185]  Robert P. W. Duin,et al.  K-nearest Neighbors Directed Noise Injection in Multilayer Perceptron Training , 2000, IEEE Trans. Neural Networks Learn. Syst..

[186]  Gian Luca Marcialis,et al.  An Experimental Comparison of Fixed and Trained Fusion Rules for Crisp Classifier Outputs , 2002, Multiple Classifier Systems.

[187]  Nathan Intrator,et al.  Automatic model selection in a hybrid perceptron/radial network , 2001, Inf. Fusion.

[188]  Mario Vento,et al.  Reliability Parameters to Improve Combination Strategies in Multi-Expert Systems , 1999, Pattern Analysis & Applications.

[189]  Shih-Fu Chang,et al.  A highly efficient system for automatic face region detection in MPEG video , 1997, IEEE Trans. Circuits Syst. Video Technol..

[190]  Jon Rigelsford Evolutionary Robotics: The Biology, Intelligence, and Technology of Self-organising Machines , 2001 .

[191]  Juergen Luettin,et al.  Evaluation Protocol for the extended M2VTS Database (XM2VTSDB) , 1998 .

[192]  Mübeccel Demirekler,et al.  An information theoretic framework for weight estimation in the combination of probabilistic classifiers for speaker identification , 2000, Speech Commun..

[193]  Saso Dzeroski,et al.  Combining Classifiers with Meta Decision Trees , 2003, Machine Learning.

[194]  David Windridge,et al.  Combined Classifier Optimisation via Feature Selection , 2000, SSPR/SPR.

[195]  Ilona Jagielska,et al.  An investigation into the application of neural networks, fuzzy logic, genetic algorithms, and rough sets to automated knowledge acquisition for classification problems , 1999, Neurocomputing.

[196]  Tom M. Mitchell,et al.  Improving Text Classification by Shrinkage in a Hierarchy of Classes , 1998, ICML.

[197]  Ian H. Witten,et al.  Stacked generalization: when does it work? , 1997, IJCAI 1997.

[198]  Ian H. Witten,et al.  Data mining: practical machine learning tools and techniques with Java implementations , 2002, SGMD.