Learning Adversarially Fair and Transferable Representations

In this paper, we advocate for representation learning as the key to mitigating unfair prediction outcomes downstream. Motivated by a scenario where learned representations are used by third parties with unknown objectives, we propose and explore adversarial representation learning as a natural method of ensuring those parties act fairly. We connect group fairness (demographic parity, equalized odds, and equal opportunity) to different adversarial objectives. Through worst-case theoretical guarantees and experimental validation, we show that the choice of this objective is crucial to fair prediction. Furthermore, we present the first in-depth experimental demonstration of fair transfer learning and demonstrate empirically that our learned representations admit fair predictions on new tasks while maintaining utility, an essential goal of fair representation learning.

[1]  Thomas M. Cover,et al.  Elements of Information Theory , 2005 .

[2]  J. Urgen Schmidhuber,et al.  Learning Factorial Codes by Predictability Minimization , 1992 .

[3]  Geoffrey E. Hinton,et al.  Reducing the Dimensionality of Data with Neural Networks , 2006, Science.

[4]  Bernhard Schölkopf,et al.  A Kernel Method for the Two-Sample-Problem , 2006, NIPS.

[5]  John Blitzer,et al.  Domain Adaptation with Structural Correspondence Learning , 2006, EMNLP.

[6]  Geoffrey E. Hinton Reducing the Dimensionality of Data with Neural , 2008 .

[7]  Aapo Hyvärinen,et al.  Noise-contrastive estimation: A new estimation principle for unnormalized statistical models , 2010, AISTATS.

[8]  Jun Sakuma,et al.  Fairness-Aware Classifier with Prejudice Remover Regularizer , 2012, ECML/PKDD.

[9]  Cristian Sminchisescu,et al.  Semantic Segmentation with , 2012 .

[10]  Toniann Pitassi,et al.  Fairness through awareness , 2011, ITCS '12.

[11]  Pascal Vincent,et al.  Representation Learning: A Review and New Perspectives , 2012, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[12]  Toniann Pitassi,et al.  Learning Fair Representations , 2013, ICML.

[13]  Yoshua Bengio,et al.  Generative Adversarial Nets , 2014, NIPS.

[14]  Josep Domingo-Ferrer,et al.  Discrimination- and privacy-aware patterns , 2014, Data Mining and Knowledge Discovery.

[15]  Jimmy Ba,et al.  Adam: A Method for Stochastic Optimization , 2014, ICLR.

[16]  François Laviolette,et al.  Domain-Adversarial Training of Neural Networks , 2015, J. Mach. Learn. Res..

[17]  Wojciech Zaremba,et al.  Improved Techniques for Training GANs , 2016, NIPS.

[18]  Amos J. Storkey,et al.  Censoring Representations with an Adversary , 2015, ICLR.

[19]  Augustus Odena,et al.  Semi-Supervised Learning with Generative Adversarial Networks , 2016, ArXiv.

[20]  Max Welling,et al.  The Variational Fair Autoencoder , 2015, ICLR.

[21]  Adam Tauman Kalai,et al.  Man is to Computer Programmer as Woman is to Homemaker? Debiasing Word Embeddings , 2016, NIPS.

[22]  Nathan Srebro,et al.  Equality of Opportunity in Supervised Learning , 2016, NIPS.

[23]  Camille Couprie,et al.  Semantic Segmentation using Adversarial Networks , 2016, NIPS 2016.

[24]  Kush R. Varshney,et al.  Optimized Pre-Processing for Discrimination Prevention , 2017, NIPS.

[25]  Guy N. Rothblum,et al.  Calibration for the (Computationally-Identifiable) Masses , 2017, ArXiv.

[26]  Alexandra Chouldechova,et al.  Fair prediction with disparate impact: A study of bias in recidivism prediction instruments , 2016, Big Data.

[27]  Katrina Ligett,et al.  Learning Fair Classifiers: A Regularization-Inspired Approach , 2017, ArXiv.

[28]  Zhe Zhao,et al.  Data Decisions and Theoretical Implications when Adversarially Learning Fair Representations , 2017, ArXiv.

[29]  D. Elgesem,et al.  On fairness , 2017 .

[30]  Krishna P. Gummadi,et al.  Fairness Beyond Disparate Treatment & Disparate Impact: Learning Classification without Disparate Mistreatment , 2016, WWW.

[31]  Cheng Soon Ong,et al.  Provably Fair Representations , 2017, ArXiv.

[32]  Kush R. Varshney,et al.  Optimized Data Pre-Processing for Discrimination Prevention , 2017, ArXiv.

[33]  Jon M. Kleinberg,et al.  Inherent Trade-Offs in the Fair Determination of Risk Scores , 2016, ITCS.

[34]  Aaron C. Courville,et al.  Improved Training of Wasserstein GANs , 2017, NIPS.

[35]  Jon M. Kleinberg,et al.  On Fairness and Calibration , 2017, NIPS.

[36]  Seth Neel,et al.  Preventing Fairness Gerrymandering: Auditing and Learning for Subgroup Fairness , 2017, ICML.

[37]  Blake Lemoine,et al.  Mitigating Unwanted Biases with Adversarial Learning , 2018, AIES.

[38]  Toniann Pitassi,et al.  Predict Responsibly: Increasing Fairness by Learning To Defer , 2018, ICLR.