A General Theory of Classificatory Sorting Strategies: 1. Hierarchical Systems

It is shown that the computational behaviour of a hierarchical sorting-strategy depends on three properties, which are established for five conventional strategies and four measures. The conventional strategies are shown to be simple variants of a single linear system defined by four parameters. A new strategy is defined, enabling continuous variation of intensity of grouping by variation in a single parameter. An Appendix provides specifications of computer programs embodying the new principles.
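The "single linear system defined by four parameters" can be illustrated with a short sketch. The abstract does not give the formula itself; the form below is the recurrence as it is generally stated in the clustering literature, d_hk = α_i·d_hi + α_j·d_hj + β·d_ij + γ·|d_hi − d_hj|, where groups i and j have just been fused into k and h is any other group. The function names and the example values are hypothetical, chosen only for illustration, and the parameter settings shown for the nearest-neighbour, furthest-neighbour, and flexible strategies are the commonly quoted ones rather than values taken from this text.

```python
def lance_williams(d_hi, d_hj, d_ij, alpha_i, alpha_j, beta, gamma):
    """Distance from group h to the new group k = (i fused with j).

    Assumed form of the four-parameter linear update:
        d_hk = alpha_i*d_hi + alpha_j*d_hj + beta*d_ij + gamma*|d_hi - d_hj|
    """
    return (alpha_i * d_hi + alpha_j * d_hj
            + beta * d_ij + gamma * abs(d_hi - d_hj))


def nearest_neighbour(d_hi, d_hj, d_ij):
    # alpha_i = alpha_j = 1/2, beta = 0, gamma = -1/2 reduces to min(d_hi, d_hj).
    return lance_williams(d_hi, d_hj, d_ij, 0.5, 0.5, 0.0, -0.5)


def furthest_neighbour(d_hi, d_hj, d_ij):
    # alpha_i = alpha_j = 1/2, beta = 0, gamma = +1/2 reduces to max(d_hi, d_hj).
    return lance_williams(d_hi, d_hj, d_ij, 0.5, 0.5, 0.0, 0.5)


def flexible(d_hi, d_hj, d_ij, beta):
    # Single free parameter beta (assumed constraint: beta < 1), with
    # alpha_i = alpha_j = (1 - beta) / 2 and gamma = 0.
    alpha = (1.0 - beta) / 2.0
    return lance_williams(d_hi, d_hj, d_ij, alpha, alpha, beta, 0.0)


if __name__ == "__main__":
    # Hypothetical distances: h to i, h to j, and i to j.
    d_hi, d_hj, d_ij = 2.0, 5.0, 3.0
    print(nearest_neighbour(d_hi, d_hj, d_ij))
    print(furthest_neighbour(d_hi, d_hj, d_ij))
    print(flexible(d_hi, d_hj, d_ij, beta=-0.25))
```

Varying `beta` in `flexible` is one way to picture the "continuous variation of intensity of grouping" that the abstract attributes to the new strategy: the same update rule is reused and only one number changes.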
