Distributed Estimation of Generalized Matrix Rank: Efficient Algorithms and Lower Bounds

We study the following generalized matrix rank estimation problem: given an n × n matrix and a constant c ≥ 0, estimate the number of eigenvalues that are greater than c. In the distributed setting, the matrix of interest is the sum of m matrices held by separate machines. We show that any deterministic algorithm solving this problem must communicate Ω(n²) bits, which is order-equivalent to transmitting the whole matrix. In contrast, we propose a randomized algorithm that communicates only Õ(n) bits. The upper bound is matched by an Ω(n) lower bound on the randomized communication complexity. We demonstrate the practical effectiveness of the proposed algorithm with numerical experiments.
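To make the problem statement concrete, the following is a minimal sketch of the naive deterministic baseline: every machine's matrix is summed in full (the step that costs on the order of n² numbers of communication), and the eigenvalues above c are then counted. It is not the randomized algorithm proposed in the paper. The function name `count_eigs_above` and the example matrices are hypothetical, and the sketch assumes the summed matrix is symmetric, which the abstract does not state explicitly. The count uses Sylvester's law of inertia: the number of positive pivots in an unpivoted elimination of A − cI equals the number of eigenvalues of A greater than c.

```python
from fractions import Fraction


def count_eigs_above(local_matrices, c):
    """Count eigenvalues of sum(local_matrices) that are greater than c.

    Assumes the summed matrix is symmetric (an assumption of this sketch).
    Raises ValueError if elimination hits a zero pivot, in which case a
    slightly different threshold c can be tried.
    """
    n = len(local_matrices[0])
    # Aggregate the full matrix: this is the Theta(n^2) communication step.
    a = [
        [Fraction(sum(m[i][j] for m in local_matrices)) for j in range(n)]
        for i in range(n)
    ]
    # Shift by c so that "eigenvalue > c" becomes "eigenvalue > 0".
    for i in range(n):
        a[i][i] -= Fraction(c)
    positive = 0
    for k in range(n):
        pivot = a[k][k]
        if pivot == 0:
            raise ValueError("zero pivot; perturb the threshold c")
        if pivot > 0:
            positive += 1
        # Eliminate column k below the diagonal (exact rational arithmetic).
        for i in range(k + 1, n):
            factor = a[i][k] / pivot
            for j in range(k + 1, n):
                a[i][j] -= factor * a[k][j]
    return positive


# Two hypothetical machines; their sum is the tridiagonal matrix
# [[2, 1, 0], [1, 2, 1], [0, 1, 2]] with eigenvalues 2 - sqrt(2), 2, 2 + sqrt(2).
machine_1 = [[2, 1, 0], [1, 0, 0], [0, 0, 1]]
machine_2 = [[0, 0, 0], [0, 2, 1], [0, 1, 1]]
print(count_eigs_above([machine_1, machine_2], 1.5))  # prints 2
```

Note that neither local matrix needs to be symmetric on its own; only the sum matters, which is why the count cannot be obtained from the machines' individual spectra.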
