Algorithms for the visualization of large and multivariate data sets

In this chapter we discuss algorithms for clustering and visualizing large, multivariate data sets. We describe an algorithm for exploratory data analysis that combines adaptive c-means clustering with multidimensional scaling (ACMDS). ACMDS visualizes clustering processes online and may be considered an alternative to Kohonen's self-organizing feature map (SOM). Whereas SOM is a heuristic neural network algorithm, ACMDS is derived from multivariate statistical algorithms. The implications of ACMDS are illustrated using five different data sets.
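The abstract does not give the ACMDS update rules, so the following is only a minimal sketch of the general idea it names: cluster the data with an online c-means rule, then project the cluster centres to two dimensions with multidimensional scaling. The function names, the learning rate `lr`, the use of classical (Torgerson) MDS, and the synthetic data are all assumptions made for illustration and are not taken from the chapter.

```python
import numpy as np

def online_c_means(data, n_centres, lr=0.05, epochs=5, seed=0):
    """Online c-means sketch: pull the nearest centre toward each sample.

    lr and epochs are illustrative assumptions, not values from the chapter.
    """
    rng = np.random.default_rng(seed)
    # Initialise the centres on randomly chosen samples.
    start = rng.choice(len(data), n_centres, replace=False)
    centres = data[start].astype(float)
    for _ in range(epochs):
        for x in data[rng.permutation(len(data))]:
            # Winner = centre with the smallest squared Euclidean distance.
            winner = np.argmin(((centres - x) ** 2).sum(axis=1))
            centres[winner] += lr * (x - centres[winner])
    return centres

def classical_mds(points, n_dims=2):
    """Classical (Torgerson) MDS computed from Euclidean distances."""
    # Squared pairwise distances between all points.
    sq = ((points[:, None, :] - points[None, :, :]) ** 2).sum(axis=2)
    n = len(points)
    centring = np.eye(n) - np.ones((n, n)) / n
    # Double centring turns squared distances into a Gram matrix.
    gram = -0.5 * centring @ sq @ centring
    vals, vecs = np.linalg.eigh(gram)
    top = np.argsort(vals)[::-1][:n_dims]
    # Clip tiny negative eigenvalues caused by rounding.
    return vecs[:, top] * np.sqrt(np.clip(vals[top], 0.0, None))

# Synthetic example: three Gaussian blobs in five dimensions.
rng = np.random.default_rng(42)
blobs = [rng.normal(loc=m, scale=0.3, size=(100, 5)) for m in (0.0, 2.0, 4.0)]
data = np.vstack(blobs)

centres = online_c_means(data, n_centres=6)
embedding = classical_mds(centres, n_dims=2)
print(embedding.shape)
```

Re-running `classical_mds` on the current centres after each batch of updates would give a crude picture of how the clustering evolves, which is the kind of online view the abstract describes; how ACMDS actually couples the two steps is left to the chapter itself.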
