Hierarchical Clustering With Prototypes via Minimax Linkage

Agglomerative hierarchical clustering is a popular class of methods for understanding the structure of a dataset. The nature of the clustering depends on the choice of linkage—that is, on how one measures the distance between clusters. In this article we investigate minimax linkage, a recently introduced but little-studied linkage. Minimax linkage is unique in naturally associating a prototype chosen from the original dataset with every interior node of the dendrogram. These prototypes can be used to greatly enhance the interpretability of a hierarchical clustering. Furthermore, we prove that minimax linkage has a number of desirable theoretical properties; for example, minimax-linkage dendrograms cannot have inversions (unlike centroid-linkage dendrograms), and minimax linkage is robust against certain perturbations of a dataset. We provide an efficient implementation and illustrate minimax linkage’s strengths as a data analysis and visualization tool on a study of words from encyclopedia articles and on a dataset of images of human faces.
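To make the idea concrete, the following is a minimal sketch of agglomerative clustering with minimax linkage: the distance between two clusters is the smallest radius of a ball, centered at one of the clusters' own points, that covers their union, and that center point becomes the merged cluster's prototype. This is a naive illustrative implementation, not the efficient one provided by the authors; the function names and the merge-history output format are ours.

```python
import numpy as np

def minimax_radius(points):
    """Return (r, j): the minimax radius of `points` and the index of the
    prototype, i.e., the point whose maximum distance to the rest is smallest."""
    diff = points[:, None, :] - points[None, :, :]
    dists = np.sqrt((diff ** 2).sum(-1))   # pairwise Euclidean distances
    max_d = dists.max(axis=1)              # each point's distance to its farthest peer
    j = int(max_d.argmin())
    return max_d[j], j

def minimax_linkage(X):
    """Naive agglomerative clustering with minimax linkage (O(n^3) per merge).

    Returns the merge history as a list of tuples
    (members_of_cluster_A, members_of_cluster_B, minimax_radius, prototype_index),
    where the prototype is always an index into the original dataset X.
    """
    clusters = [[i] for i in range(len(X))]
    merges = []
    while len(clusters) > 1:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                idx = clusters[a] + clusters[b]
                r, j = minimax_radius(X[idx])
                if best is None or r < best[0]:
                    best = (r, a, b, idx[j])
        r, a, b, proto = best
        merges.append((clusters[a], clusters[b], r, proto))
        clusters[a] = clusters[a] + clusters[b]   # merge the two closest clusters
        del clusters[b]
    return merges
```

On a toy dataset of two well-separated pairs, the first two merges join the near pairs (radius 1), and the final merge covers all four points at the minimax radius; each interior node carries an original data point as its prototype, which is the interpretability property the abstract highlights.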
