Near-optimal sample complexity bounds for circulant binary embedding

Binary embedding is the problem of mapping points from a high-dimensional space to a lower-dimensional Hamming cube while preserving pairwise distances. An efficient way to accomplish this is to use fast embedding techniques based on the Fourier transform, such as circulant matrices. While binary embedding has been studied extensively, theoretical results on fast binary embedding are rather limited. In this work, we build upon the recent literature to obtain significantly better dependencies on the problem parameters. We show that a set of N points in ℝ<sup>n</sup> can be embedded into the Hamming cube {±1}<sup>k</sup> with distortion δ using k ∼ δ<sup>−3</sup> log N samples. This is optimal in the number of points N and compares well with the optimal distortion dependency δ<sup>−2</sup>. Our optimal embedding result applies in the regime log N ≲ n<sup>1/3</sup>. Furthermore, under the looser condition log N ≲ √n, we show that all but an arbitrarily small fraction of the points can be optimally embedded. We believe the proposed techniques can be useful for obtaining improved guarantees for other nonlinear embedding problems.
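To make the setting concrete, the following is a minimal sketch of a circulant binary embedding, not the construction analyzed in the paper. It assumes a common form from the circulant embedding literature: random sign flips on the input, multiplication by a Gaussian circulant matrix computed through the FFT, subsampling to the first k coordinates, and an entrywise sign. The function names (`sample_circulant_params`, `circulant_binary_embed`, `normalized_hamming`, `angular_distance`) and the choice of normalized angular distance as the quantity being preserved are illustrative assumptions.

```python
import numpy as np


def sample_circulant_params(n, rng):
    """Draw the randomness of the embedding (assumed form, not from the paper)."""
    signs = rng.choice([-1.0, 1.0], size=n)  # diagonal of random sign flips
    g = rng.standard_normal(n)               # first column of the circulant matrix
    return signs, g


def circulant_binary_embed(X, signs, g, k):
    """Map each row of X (shape N x n) to a code in {-1, +1}^k."""
    n = X.shape[1]
    if not 1 <= k <= n:
        raise ValueError("k must satisfy 1 <= k <= n")
    # Multiplying by the circulant matrix with first column g is a circular
    # convolution with g, which the FFT computes in O(n log n) per point.
    flipped = X * signs
    conv = np.fft.ifft(np.fft.fft(flipped, axis=1) * np.fft.fft(g), axis=1).real
    # Keep the first k coordinates and quantize each to one bit.
    return np.where(conv[:, :k] >= 0, 1, -1)


def normalized_hamming(a, b):
    """Fraction of coordinates in which two codes differ."""
    return float(np.mean(a != b))


def angular_distance(x, y):
    """Angle between x and y divided by pi, so the value lies in [0, 1]."""
    cos = np.dot(x, y) / (np.linalg.norm(x) * np.linalg.norm(y))
    return float(np.arccos(np.clip(cos, -1.0, 1.0)) / np.pi)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n, k, num_points = 1024, 256, 20
    X = rng.standard_normal((num_points, n))
    signs, g = sample_circulant_params(n, rng)
    codes = circulant_binary_embed(X, signs, g, k)

    # Largest gap between Hamming and angular distance over all pairs.
    worst = max(
        abs(normalized_hamming(codes[i], codes[j]) - angular_distance(X[i], X[j]))
        for i in range(num_points)
        for j in range(i + 1, num_points)
    )
    print(f"max pairwise deviation with k={k}: {worst:.3f}")
```

In this sketch the printed deviation plays the role of the distortion δ for the sampled point set. Increasing k should shrink it, which is the trade-off that the sample complexity bound k ∼ δ<sup>−3</sup> log N quantifies.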
