Transfer of Adversarial Robustness Between Perturbation Types

We study the transfer of adversarial robustness of deep neural networks between different perturbation types. While most work on adversarial examples has focused on $L_\infty$- and $L_2$-bounded perturbations, these do not capture all types of perturbations available to an adversary. The present work evaluates 32 attacks of 5 different types against models adversarially trained on a 100-class subset of ImageNet. Our empirical results suggest that evaluating on a wide range of perturbation sizes is necessary to understand whether adversarial robustness transfers between perturbation types. We further demonstrate that robustness against one perturbation type does not always transfer to other perturbation types, and can sometimes even reduce robustness against them. In light of these results, we recommend that adversarial defenses be evaluated on a diverse range of perturbation types and sizes.
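For context, an $L_\infty$-bounded threat model constrains each pixel of the perturbation to lie within $\pm\epsilon$ of the original image. Below is a minimal sketch of the standard projected gradient descent (PGD) attack under this constraint, in the spirit of Madry et al.; it assumes a PyTorch classifier `model` with inputs scaled to $[0, 1]$, and the hyperparameters shown (budget, step size, iteration count) are illustrative rather than the settings used in this work.

```python
# Illustrative PGD L_infinity attack sketch (not the paper's exact evaluation code).
import torch
import torch.nn.functional as F

def pgd_linf(model, x, y, eps=8 / 255, alpha=2 / 255, steps=10):
    """Return adversarial examples within an L_inf ball of radius eps around x."""
    # Random start inside the L_inf ball, clipped to the valid image range.
    x_adv = (x + torch.empty_like(x).uniform_(-eps, eps)).clamp(0, 1).detach()

    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad = torch.autograd.grad(loss, x_adv)[0]
        # Ascend the loss along the gradient sign, then project back into the ball.
        x_adv = x_adv.detach() + alpha * grad.sign()
        x_adv = torch.min(torch.max(x_adv, x - eps), x + eps)  # L_inf projection
        x_adv = x_adv.clamp(0, 1)                              # valid pixel range
    return x_adv.detach()
```

Sweeping `eps` over a range of values, and repeating the exercise for other perturbation types (e.g. $L_2$, $L_1$, or spatial transformations), is the kind of evaluation the abstract argues is necessary.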
