Finding Clusters and Components by Unsupervised Learning

We present a tutorial survey of some recent approaches to unsupervised machine learning in the context of statistical pattern recognition. In statistical pattern recognition, there are two classical categories of unsupervised learning methods and models: first, variations of Principal Component Analysis and Factor Analysis, and second, learning vector coding or clustering methods. These are the starting point of this article. The more recent trend in unsupervised learning is to consider the problem in the framework of probabilistic generative models. If it is possible to build and estimate a model that explains the data in terms of some latent variables, key insights may be obtained into the true nature and structure of the data. This approach is also reviewed, with examples such as linear and nonlinear independent component analysis and topological maps.
