On the duality between contrastive and non-contrastive self-supervised learning

Recent approaches to self-supervised learning of image representations can be categorized into different families of methods and, in particular, divided into contrastive and non-contrastive approaches. While the differences between the two families have been thoroughly discussed to motivate new approaches, we focus on their theoretical similarities. By designing contrastive and non-contrastive criteria that can be related algebraically and shown to be equivalent under limited assumptions, we show how close these families can be. We further study popular methods and introduce variations of them, allowing us to relate this theoretical result to current practices and to show how design choices in the criterion can influence the optimization process and downstream performance. We also challenge the popular assumptions that contrastive methods need large batch sizes and that non-contrastive methods need large output dimensions. Our theoretical and quantitative results suggest that the numerical gaps between contrastive and non-contrastive methods in certain regimes can be significantly reduced given better network design choices and hyperparameter tuning.
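
To make the two families concrete, the sketch below pairs a standard SimCLR-style InfoNCE loss (contrastive) with a VICReg-style variance-invariance-covariance loss (non-contrastive). It is a minimal illustration of the generic criteria, not the specific variants designed and related algebraically in the paper; the loss weights, temperature value, and helper names are illustrative assumptions.

```python
# Minimal sketch of the two loss families on two embedding batches z_a, z_b
# of shape (N, D), produced from two augmented views of the same N images.
# Hyperparameters below are common defaults, not the paper's settings.
import torch
import torch.nn.functional as F

def info_nce_loss(z_a, z_b, temperature=0.1):
    """Contrastive criterion: pull matched pairs together, push other batch samples apart."""
    z_a = F.normalize(z_a, dim=1)
    z_b = F.normalize(z_b, dim=1)
    logits = z_a @ z_b.T / temperature                     # (N, N) pairwise similarities
    targets = torch.arange(z_a.size(0), device=z_a.device) # positives sit on the diagonal
    return F.cross_entropy(logits, targets)

def vicreg_style_loss(z_a, z_b, sim_w=25.0, var_w=25.0, cov_w=1.0):
    """Non-contrastive criterion: invariance + variance + covariance regularization."""
    n, d = z_a.shape
    # Invariance: the two views of the same image should map to similar embeddings.
    sim = F.mse_loss(z_a, z_b)
    # Variance: keep each embedding dimension's std above a margin to avoid collapse.
    std_a = torch.sqrt(z_a.var(dim=0) + 1e-4)
    std_b = torch.sqrt(z_b.var(dim=0) + 1e-4)
    var = torch.mean(F.relu(1.0 - std_a)) + torch.mean(F.relu(1.0 - std_b))
    # Covariance: decorrelate embedding dimensions (off-diagonal terms only).
    z_a_c = z_a - z_a.mean(dim=0)
    z_b_c = z_b - z_b.mean(dim=0)
    cov_a = (z_a_c.T @ z_a_c) / (n - 1)
    cov_b = (z_b_c.T @ z_b_c) / (n - 1)
    off_diag = lambda m: m - torch.diag(torch.diag(m))
    cov = off_diag(cov_a).pow(2).sum() / d + off_diag(cov_b).pow(2).sum() / d
    return sim_w * sim + var_w * var + cov_w * cov
```

In this rough picture, the contrastive loss uses the other samples in the batch as explicit negatives, while the non-contrastive loss avoids collapse by regularizing the statistics of the embeddings themselves, which is the distinction the paper relates algebraically.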
