Advances in Nonnegative Matrix and Tensor Factorization

Nonnegative matrix factorization (NMF) and its extension, nonnegative tensor factorization (NTF), are emerging techniques that have been proposed recently. The goal of NMF/NTF is to decompose a nonnegative data matrix or tensor (i.e., a multiway array) into a product of lower-rank nonnegative matrices or tensors. Like independent component analysis (ICA) and sparse component analysis (SCA), NMF is a useful and promising approach for representing high-dimensional datasets in a lower-dimensional space. NMF models and techniques have recently attracted a great deal of interest because they can provide new insights into the complex latent relationships in experimental datasets and yield meaningful components with physical or physiological interpretations. For example, in bioinformatics, NMF and its extensions have been successfully applied to gene expression, sequence analysis, functional characterization of genes, clustering, and text mining.

The main difference between NMF and classical factorizations such as principal component analysis (PCA), SCA, or ICA lies in the nonnegativity constraint, usually combined with additional constraints such as sparseness, smoothness, and/or orthogonality imposed on the models. These constraints tend to lead to a parts-based representation of the data, because they allow only additive, not subtractive, combinations of data items. The nonnegative components or factors produced by this approach can therefore be interpreted as parts of the data, which is an advantage for the interpretability of the estimated components. Furthermore, in many real applications, the data have a multiway (tensor) structure.
Examples include video streams (rows, columns, RGB color coordinates, time), EEG in neuroscience (channels, frequency, time, samples, conditions, subjects), and bibliographic text data (keywords, papers, authors, journals). Conventional methods preprocess multiway data by arranging them into a matrix. Recently, a great deal of research has addressed multiway analysis that preserves the original multiway structure of the data. These techniques have been shown to be very useful in a number of applications, such as signal separation, feature extraction, audio coding, speech classification, image compression, spectral clustering, neuroscience, and biomedical signal analysis.

This special issue focuses on the most recent advances in NMF/NTF methods, with particular emphasis on work by researchers in signal processing and neuroscience. It reports novel theoretical results, efficient algorithms, and their applications. It also provides insight into current challenging areas and identifies future research directions. The issue includes several important contributions that cover a wide range of approaches and techniques for NMF/NTF and their applications. These contributions are summarized as follows.

The first paper, entitled “Probabilistic latent variable models as nonnegative factorizations” by M. Shashanka et al., presents a family of probabilistic latent variable models that can be used for the analysis of nonnegative data. The paper shows that there are strong ties between NMF and this family, and provides some straightforward extensions that help in dealing with shift invariances, higher-order decompositions, and sparsity constraints. Furthermore, it argues through these extensions that this approach allows for rapid development of complex statistical models for analyzing nonnegative data.
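
To make the tie between latent variable models and nonnegative factorizations concrete, the following is a minimal sketch of a generic two-dimensional latent variable decomposition fitted by expectation-maximization (a PLSA-style model). It is an illustration under stated assumptions, not the algorithm from the paper: the function name `plsa_em`, its parameters, and the toy data are all hypothetical. The fitted model approximates the normalized data by W · diag(p_z) · H, which is a product of nonnegative low-rank factors.

```python
import numpy as np

def plsa_em(V, n_components, n_iter=200, seed=0):
    """Fit P(i, j) ~= sum_z P(z) P(i|z) P(j|z) to a nonnegative matrix V by EM.

    Hypothetical helper for illustration only.
    Returns W (columns are P(i|z)), pz (P(z)), and H (rows are P(j|z)).
    """
    rng = np.random.default_rng(seed)
    n_rows, n_cols = V.shape
    P = V / V.sum()                        # treat the data as a joint distribution
    W = rng.random((n_rows, n_components))
    W /= W.sum(axis=0, keepdims=True)      # each column sums to 1
    H = rng.random((n_components, n_cols))
    H /= H.sum(axis=1, keepdims=True)      # each row sums to 1
    pz = np.full(n_components, 1.0 / n_components)
    eps = 1e-12                            # guards against division by zero
    for _ in range(n_iter):
        model = (W * pz) @ H               # current estimate of P(i, j)
        ratio = P / (model + eps)
        # E and M steps combined: accumulate P(i, j) * P(z | i, j)
        W_new = (W * pz) * (ratio @ H.T)            # summed over j
        H_new = (H * pz[:, None]) * (W.T @ ratio)   # summed over i
        pz = W_new.sum(axis=0)
        W = W_new / (W_new.sum(axis=0, keepdims=True) + eps)
        H = H_new / (H_new.sum(axis=1, keepdims=True) + eps)
    return W, pz, H

# Toy usage: a random nonnegative matrix of rank 2 (assumed data).
data_rng = np.random.default_rng(1)
V = data_rng.random((6, 2)) @ data_rng.random((2, 5))
W, pz, H = plsa_em(V, n_components=2)
approx = (W * pz) @ H                      # nonnegative low-rank approximation of V / V.sum()
print(np.abs(approx - V / V.sum()).max())
```

All three factors stay nonnegative by construction, because every update multiplies nonnegative quantities and then renormalizes.
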

The second paper, entitled “Fast nonnegative matrix factorization algorithms using projected gradient approaches for large-scale problems” by R. Zdunek and A. Cichocki, investigates the applicability of projected gradient (PG) methods to NMF. The motivation is the observation that PG methods are highly efficient at solving large-scale convex minimization problems subject to linear constraints, and that the minimization problems underlying NMF of large matrices match this class well. In particular, the paper investigates several modified and adapted methods, including the projected Landweber method, Barzilai-Borwein gradient projection, projected sequential subspace optimization, the interior-point Newton algorithm, and the sequential coordinatewise minimization algorithm, and compares their performance in terms of signal-to-interference ratio and elapsed time, using a simple benchmark of mixed partially dependent nonnegative signals.

The third paper, entitled “Theorems on positive data: on the uniqueness of NMF” by H. Laurberg et al., investigates the conditions under which NMF is unique, and introduces several theorems that can determine whether a decomposition is in fact unique. Several examples are provided to show the use of the theorems and their limitations. The paper also shows that corruption of a unique NMF matrix by additive noise leads to a noisy estimation of the noise-free unique solution. Moreover, it uses a stochastic view of NMF to analyze which characterization of the underlying model will result in an NMF with small estimation errors.

The fourth paper, entitled “Nonnegative matrix factorization with Gaussian process priors” by M. N. Schmidt and H. Laurberg, presents a general method for including prior knowledge in NMF, based on Gaussian process priors. It assumes that the nonnegative factors in the NMF are linked by a strictly increasing function to an underlying Gaussian process specified by its covariance function.
The NMF decompositions are found to be in agreement with the prior knowledge of the distribution of the factors, such as sparseness, smoothness, and symmetries.

The fifth paper, entitled “Extended nonnegative tensor factorisation models for musical sound source separation” by D. FitzGerald et al., presents a new additive synthesis-based NTF approach that allows the use of linear-frequency spectrograms and the imposition of strict harmonic constraints. This gives an improved model compared with some existing shift-invariant tensor factorization algorithms, in which the log-frequency spectrograms used to obtain shift invariance in frequency cause problems when attempting to resynthesize the separated sources. The paper further studies the addition of a source filter model to the factorization framework, and presents an extended model that is capable of separating mixtures of pitched and percussive instruments simultaneously.

The sixth paper, entitled “Gene tree labeling using nonnegative matrix factorization on biomedical literature” by K. E. Heinrich et al., addresses a challenging problem for biological applications, namely, identifying functional groups of genes. It examines the NMF technique for labeling hierarchical trees. It proposes a generic labeling algorithm as well as an evaluation technique, and discusses the effects of different NMF parameters with regard to convergence and labeling accuracy. The primary goals of this paper are to provide a qualitative assessment of NMF and its various parameters and initialization, to provide an automated way to classify biomedical data, and to provide a method for evaluating labeled data assuming a static input tree. The paper also proposes a method for generating gold standard trees.

The seventh paper, entitled “Single-trial decoding of bistable perception based on sparse nonnegative tensor decomposition” by Z. Wang et al., presents a sparse NTF-based method to extract features from the local field potential (LFP), collected from the middle temporal visual cortex in a macaque monkey, for decoding its bistable structure-from-motion perception. The advantage of the sparse NTF-based feature-extraction approach lies in its capability to yield components that are common across the space, time, and frequency domains, yet discriminative across different conditions, without prior knowledge of the discriminating frequency bands and temporal windows for a specific subject. The results suggest that imposing sparseness constraints on the NTF improves extraction of the gamma band feature, which carries the most discriminative information for bistable perception.

The eighth paper, entitled “Pattern expression nonnegative matrix factorization: algorithm and applications to blind source separation” by J. Zhang et al., presents a pattern expression NMF (PE-NMF) approach from the viewpoint of using basis vectors most effectively to express patterns. Two regularization or penalty terms are added to the original loss function of a standard NMF for effective expression of patterns with basis vectors in the PE-NMF. A learning algorithm is presented, and the convergence of the algorithm is proved theoretically. Three illustrative examples for blind source separation, including heterogeneity correction for gene microarray data, indicate that the sources can be successfully recovered with the proposed PE-NMF when the two parameters can be suitably chosen from prior knowledge of the problem.

The last paper, entitled “Robust object recognition under partial occlusions using NMF” by D. Soukup and I. Bajla, studies NMF methods for recognition tasks with occluded objects. The paper analyzes the influence of sparseness on recognition rates for various dimensions of subspaces generated for two image databases: the ORL face database and the USPS handwritten digit database.
It also studies the behavior of four types of distances between a projected unknown image object and feature vectors in NMF subspaces generated for training data. In the recognition phase, partial occlusions in the test images are modeled by placing two randomly sized, randomly positioned black rectangles into each test image.
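
The occlusion model described for the last paper can be sketched as follows. This is only an illustration under assumptions: the function name `occlude`, the `max_frac` bound on rectangle size, and the toy 16×16 image are hypothetical, since the editorial does not state how the rectangle sizes were drawn.

```python
import numpy as np

def occlude(image, rng, n_rects=2, max_frac=0.5):
    """Return a copy of a 2-D grayscale image with black rectangles pasted in.

    Hypothetical helper: max_frac (largest rectangle side as a fraction of the
    image side) is an assumed parameter, not a value taken from the paper.
    """
    out = np.array(image, copy=True)
    height, width = out.shape
    for _ in range(n_rects):
        # Random size: at least 1 pixel, at most max_frac of each side.
        h = int(rng.integers(1, max(2, int(max_frac * height) + 1)))
        w = int(rng.integers(1, max(2, int(max_frac * width) + 1)))
        # Random position such that the rectangle stays inside the image.
        top = int(rng.integers(0, height - h + 1))
        left = int(rng.integers(0, width - w + 1))
        out[top:top + h, left:left + w] = 0   # black pixels
    return out

# Toy usage: occlude an all-white 16x16 test image (assumed data).
rng = np.random.default_rng(0)
test_image = np.ones((16, 16))
occluded = occlude(test_image, rng)
print(int((occluded == 0).sum()), "pixels occluded")
```

The occluded image would then be projected into the NMF subspace and compared with the training feature vectors using whichever distance is under study.
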
