Learning Deep Features in Instrumental Variable Regression

Instrumental variable (IV) regression is a standard strategy for learning causal relationships between confounded treatment and outcome variables from observational data by using an instrumental variable, which affects the outcome only through the treatment. In classical IV regression, learning proceeds in two stages: stage 1 performs linear regression from the instrument to the treatment; and stage 2 performs linear regression from the treatment to the outcome, conditioned on the instrument. We propose a novel method, deep feature instrumental variable regression (DFIV), to address the case where relations between instruments, treatments, and outcomes may be nonlinear. In this case, deep neural nets are trained to define informative nonlinear features on the instruments and treatments. We propose an alternating training regime for these features to ensure good end-to-end performance when composing stages 1 and 2, thus obtaining highly flexible feature maps in a computationally efficient manner. DFIV outperforms recent state-of-the-art methods on challenging IV benchmarks, including settings involving high-dimensional image data. DFIV also exhibits competitive performance in off-policy policy evaluation for reinforcement learning, which can be understood as an IV regression task.
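The classical two-stage procedure described in the abstract can be illustrated with a minimal NumPy sketch. This is not the DFIV method itself: it covers only the linear case, and the simulated data, the function names `two_stage_least_squares` and `simulate`, and the true effect of 2.0 are all assumptions made for illustration.

```python
import numpy as np


def two_stage_least_squares(z, x, y):
    """Classical linear 2SLS with intercepts (illustrative helper).

    z: instrument, x: treatment, y: outcome; all 1-D arrays of equal length.
    Returns the stage-2 coefficients as [intercept, slope].
    """
    n = len(y)
    Z = np.column_stack([np.ones(n), z])
    # Stage 1: linear regression from the instrument to the treatment.
    stage1_coef, *_ = np.linalg.lstsq(Z, x, rcond=None)
    x_hat = Z @ stage1_coef
    # Stage 2: linear regression from the predicted treatment to the outcome.
    X_hat = np.column_stack([np.ones(n), x_hat])
    stage2_coef, *_ = np.linalg.lstsq(X_hat, y, rcond=None)
    return stage2_coef


def simulate(n=20000, seed=0):
    """Hypothetical confounded data: u affects both x and y, z affects only x."""
    rng = np.random.default_rng(seed)
    z = rng.normal(size=n)                           # instrument
    u = rng.normal(size=n)                           # unobserved confounder
    x = z + u + 0.1 * rng.normal(size=n)             # treatment
    y = 2.0 * x + 3.0 * u + 0.1 * rng.normal(size=n)  # outcome, true effect 2.0
    return z, x, y


z, x, y = simulate()
ols_slope = np.polyfit(x, y, 1)[0]              # naive regression of y on x
iv_slope = two_stage_least_squares(z, x, y)[1]  # 2SLS estimate
print("OLS slope:", ols_slope)
print("2SLS slope:", iv_slope)
```

In this simulation the naive regression is biased upward by the confounder, while the two-stage estimate should land close to the assumed true effect. As the abstract describes, DFIV replaces the raw instrument and treatment in both stages with learned neural-network features; that part is not shown here.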

Nando de Freitas | Liyuan Xu | Yutian Chen | Siddarth Srinivasan | Arnaud Doucet | Arthur Gretton
