Semisupervised learning of hierarchical latent trait models for data visualization

Recently, we have developed the hierarchical generative topographic mapping (HGTM), an interactive method for visualizing large, high-dimensional, real-valued data sets. We propose a more general visualization system by extending HGTM in three ways, which allow the user to visualize a wider range of data sets and which better support the model development process. 1) We integrate HGTM with noise models from the exponential family of distributions. The basic building block is the latent trait model (LTM). This enables us to visualize data of an inherently discrete nature, e.g., collections of documents, in a hierarchical manner. 2) We give the user a choice of initializing the child plots of the current plot in either interactive or automatic mode. In the interactive mode, the user selects "regions of interest", whereas in the automatic mode, an unsupervised minimum message length (MML)-inspired construction of a mixture of LTMs is employed. The unsupervised construction is particularly useful when high-level plots are covered with dense clusters of highly overlapping data projections, which makes the interactive mode difficult to use. Such a situation often arises when visualizing large data sets. 3) We derive general formulas for magnification factors in latent trait models. Magnification factors help in interpreting the visualization plots, since they can highlight the boundaries between data clusters. We illustrate our approach on a toy example and evaluate it on three more complex real data sets.
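To make the notion of a magnification factor concrete, the following is a minimal sketch for the Gaussian (GTM) special case only, where the factor at a latent point x is commonly taken as sqrt(det(J^T J)), with J the Jacobian of the smooth map y(x) = W phi(x) from latent space to data space. It is not the general latent trait formula derived in this work, which is not reproduced here. The function names, the Gaussian RBF basis, and the parameters `centres`, `width` and `W` are illustrative assumptions.

```python
import numpy as np


def rbf_basis(x, centres, width):
    """Gaussian RBF activations phi(x), shape (M,), for a latent point x of shape (L,)."""
    d2 = np.sum((centres - x) ** 2, axis=1)
    return np.exp(-d2 / (2.0 * width ** 2))


def rbf_jacobian(x, centres, width):
    """Derivative d phi / d x, shape (M, L)."""
    phi = rbf_basis(x, centres, width)
    return phi[:, None] * (centres - x) / width ** 2


def magnification_factor(x, W, centres, width):
    """sqrt(det(J^T J)) with J = W @ dphi/dx, the Jacobian of y(x) = W phi(x)."""
    J = W @ rbf_jacobian(x, centres, width)  # shape (D, L)
    return float(np.sqrt(np.linalg.det(J.T @ J)))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Hypothetical setup: 3x3 grid of RBF centres in a 2-D latent space, 5-D data space.
    g = np.linspace(-1.0, 1.0, 3)
    centres = np.array([[a, b] for a in g for b in g])
    W = rng.normal(size=(5, centres.shape[0]))
    for x in (np.array([0.0, 0.0]), np.array([0.8, -0.8])):
        print(x, magnification_factor(x, W, centres, width=1.0))
```

Evaluating the factor on a regular grid of latent points and plotting it as a grey-scale image is the usual way such values are displayed; regions where the map stretches strongly tend to coincide with gaps between clusters.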
