GTM: The Generative Topographic Mapping

Latent variable models represent the probability density of data in a space of several dimensions in terms of a smaller number of latent, or hidden, variables. A familiar example is factor analysis, which is based on a linear transformation between the latent space and the data space. In this article, we introduce a form of nonlinear latent variable model called the generative topographic mapping (GTM), for which the parameters of the model can be determined using the expectation-maximization (EM) algorithm. GTM provides a principled alternative to the widely used self-organizing map (SOM) of Kohonen (1982) and overcomes most of the significant limitations of the SOM. We demonstrate the performance of the GTM algorithm on a toy problem and on simulated data from flow diagnostics for a multiphase oil pipeline.
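To make the idea of fitting a nonlinear latent variable model with EM more concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes the commonly described GTM setup: a fixed grid of latent points, Gaussian radial basis functions plus a bias term mapping the latent space to the data space, and a single isotropic noise precision `beta`. The function name `gtm_fit`, its parameters, and the small ridge term `reg` are illustrative assumptions.

```python
import numpy as np

def gtm_fit(T, latent_grid, rbf_centres, rbf_width, n_iter=30, reg=1e-6, seed=0):
    """Sketch of EM for a GTM-style model (assumed setup, see lead-in).

    T           : (N, D) data matrix
    latent_grid : (K, L) fixed latent sample points
    rbf_centres : (M, L) centres of Gaussian basis functions in latent space
    rbf_width   : common width of the basis functions
    """
    rng = np.random.default_rng(seed)
    N, D = T.shape
    K = latent_grid.shape[0]

    # Basis matrix Phi (K x (M+1)): Gaussian RBFs plus a constant bias column.
    d2 = ((latent_grid[:, None, :] - rbf_centres[None, :, :]) ** 2).sum(-1)
    Phi = np.hstack([np.exp(-d2 / (2.0 * rbf_width ** 2)), np.ones((K, 1))])
    n_basis = Phi.shape[1]

    # Initialisation (an assumption): small random weights, bias at the data mean.
    W = rng.normal(scale=0.1, size=(n_basis, D))
    W[-1] = T.mean(axis=0)
    beta = 1.0 / T.var()

    log_liks = []
    for _ in range(n_iter):
        # E-step: responsibility of each latent point k for each data point n.
        Y = Phi @ W                                            # (K, D) mixture centres
        dist2 = ((Y[:, None, :] - T[None, :, :]) ** 2).sum(-1)  # (K, N)
        log_p = -0.5 * beta * dist2 + 0.5 * D * np.log(beta / (2 * np.pi)) - np.log(K)
        m = log_p.max(axis=0)
        log_norm = m + np.log(np.exp(log_p - m).sum(axis=0))   # log p(t_n)
        R = np.exp(log_p - log_norm)                           # (K, N), columns sum to 1
        log_liks.append(log_norm.sum())

        # M-step for W: solve (Phi^T G Phi + reg I) W = Phi^T R T.
        G = np.diag(R.sum(axis=1))
        A = Phi.T @ G @ Phi + reg * np.eye(n_basis)
        W = np.linalg.solve(A, Phi.T @ (R @ T))

        # M-step for beta: inverse of the responsibility-weighted mean squared error.
        Y = Phi @ W
        dist2 = ((Y[:, None, :] - T[None, :, :]) ** 2).sum(-1)
        beta = N * D / (R * dist2).sum()

    return W, beta, Phi, log_liks
```

The latent grid is held fixed throughout, so the model is a constrained Gaussian mixture whose centres `Phi @ W` are tied together by the smooth mapping. The posterior responsibilities `R` can then be used to place each data point in latent space, for example by its posterior mean, for visualization.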
