Dissimilarity Analysis and Application to Visual Comparisons

In this chapter, the embedding of a data set into a vector space is studied when an unconditional pairwise dissimilarity w between data items is given. The vector space is endowed with a suitable pseudo-Euclidean structure, and the data embedding is built by extending classical kernel principal component analysis. This embedding is unique up to an isomorphism, and it is injective if and only if w separates the data. The construction takes advantage of the axes corresponding to negative eigenvalues to develop pseudo-Euclidean scatterplot matrix representations. This new visual tool is applied to compare various dissimilarities between hidden Markov models built from persons' faces.
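To make the idea of a pseudo-Euclidean embedding concrete, the following is a minimal sketch and not the chapter's exact construction. It assumes that the dissimilarity is supplied as a symmetric matrix `d` with a zero diagonal, and that it is squared and double-centered as in classical multidimensional scaling. The function names `pseudo_euclidean_embedding` and `pseudo_sq_dist` and the tolerance `tol` are hypothetical choices made for illustration.

```python
import numpy as np


def pseudo_euclidean_embedding(d, tol=1e-9):
    """Embed n items from a symmetric dissimilarity matrix d (zero diagonal).

    Returns coordinates X (n x k) and a sign vector (k,) holding +1 for
    axes of positive eigenvalues and -1 for axes of negative eigenvalues.
    Assumption: d is squared and double-centered, as in classical MDS.
    """
    d = np.asarray(d, dtype=float)
    n = d.shape[0]
    # Centering matrix J = I - (1/n) 1 1^T
    J = np.eye(n) - np.ones((n, n)) / n
    # Symmetric, possibly indefinite, Gram-like matrix
    B = -0.5 * J @ (d ** 2) @ J
    eigvals, eigvecs = np.linalg.eigh(B)
    # Drop directions whose eigenvalue is numerically zero
    scale = max(1.0, np.abs(eigvals).max())
    keep = np.abs(eigvals) > tol * scale
    eigvals, eigvecs = eigvals[keep], eigvecs[:, keep]
    # Positive axes first, then negative ones, each by decreasing magnitude
    order = np.lexsort((-np.abs(eigvals), eigvals < 0))
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # Scale each eigenvector by the square root of |eigenvalue|
    X = eigvecs * np.sqrt(np.abs(eigvals))
    signs = np.sign(eigvals)
    return X, signs


def pseudo_sq_dist(X, signs):
    """Pairwise squared pseudo-Euclidean 'distances' (entries may be negative)."""
    diff = X[:, None, :] - X[None, :, :]
    return np.einsum("ijk,k->ij", diff ** 2, signs)
```

Under these assumptions, `pseudo_sq_dist(X, signs)` reproduces `d ** 2` even when `d` violates the triangle inequality, because the axes with sign -1 subtract their contribution rather than add it. A scatterplot matrix built from the columns of `X` could then mark the negative-signature axes differently from the positive ones.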
