Log-Determinant Divergences Revisited: Alpha-Beta and Gamma Log-Det Divergences

In this paper, we review and extend a family of log-det divergences for symmetric positive definite (SPD) matrices and discuss their fundamental properties. We show how the parameterized Alpha-Beta (AB) and Gamma log-det divergences generate many well-known divergences, for example Stein's loss, the S-divergence (also called the Jensen-Bregman LogDet (JBLD) divergence), the Logdet Zero (Bhattacharyya) divergence, and the Affine Invariant Riemannian Metric (AIRM), as well as some new divergences. Moreover, we establish links and correspondences among many log-det divergences and display them on the alpha-beta plane for various sets of parameters. Furthermore, this paper bridges these divergences and shows their links to divergences of multivariate and multiway Gaussian distributions. Closed-form formulas are derived for gamma divergences of two multivariate Gaussian densities, including as special cases the Kullback-Leibler, Bhattacharyya, Rényi and Cauchy-Schwarz divergences. Symmetrized versions of the log-det divergences are also discussed and reviewed. A class of divergences is extended to multiway divergences for separable covariance (precision) matrices.
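As a rough illustration of how one parameterized divergence can reproduce several known ones, the sketch below evaluates an AB log-det divergence through the eigenvalues of P Q^{-1}. The abstract does not state the formula; the code assumes the form D_AB(P||Q) = (1/(alpha*beta)) log det[(alpha (P Q^{-1})^beta + beta (P Q^{-1})^{-alpha}) / (alpha + beta)], valid for alpha, beta and alpha + beta all nonzero. The function names are hypothetical and the limiting cases are not handled. Under that assumption, alpha = beta = 1/2 gives four times the S-divergence, and small alpha = beta approaches half the squared AIRM distance.

```python
import numpy as np


def relative_eigenvalues(P, Q):
    """Eigenvalues of P Q^{-1} for SPD P, Q, computed via a Cholesky factor of Q."""
    L = np.linalg.cholesky(Q)                 # Q = L L^T
    Linv_P = np.linalg.solve(L, P)            # L^{-1} P
    M = np.linalg.solve(L, Linv_P.T)          # L^{-1} P L^{-T} (P is symmetric)
    M = 0.5 * (M + M.T)                       # remove round-off asymmetry
    return np.linalg.eigvalsh(M)              # same spectrum as P Q^{-1}


def ab_logdet(P, Q, alpha, beta):
    """Assumed AB log-det divergence; only the generic case is covered."""
    if alpha == 0 or beta == 0 or alpha + beta == 0:
        raise ValueError("limiting cases are not implemented in this sketch")
    lam = relative_eigenvalues(P, Q)
    terms = (alpha * lam**beta + beta * lam**(-alpha)) / (alpha + beta)
    return np.sum(np.log(terms)) / (alpha * beta)


def s_divergence(P, Q):
    """S-divergence (JBLD): log det((P+Q)/2) - (1/2) log det(P Q)."""
    mid = np.linalg.slogdet(0.5 * (P + Q))[1]
    return mid - 0.5 * (np.linalg.slogdet(P)[1] + np.linalg.slogdet(Q)[1])


def airm_distance(P, Q):
    """AIRM distance: Frobenius norm of log(Q^{-1/2} P Q^{-1/2})."""
    lam = relative_eigenvalues(P, Q)
    return np.sqrt(np.sum(np.log(lam) ** 2))


def random_spd(n, seed):
    """Hypothetical helper that builds a well-conditioned SPD test matrix."""
    A = np.random.default_rng(seed).standard_normal((n, n))
    return A @ A.T + n * np.eye(n)
```

For example, with `P = random_spd(4, 0)` and `Q = random_spd(4, 1)`, `ab_logdet(P, Q, 0.5, 0.5)` should agree with `4 * s_divergence(P, Q)` up to round-off, and `ab_logdet(P, P, 0.5, 0.5)` should be numerically zero.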
