Entropic causality and greedy minimum entropy coupling

We study the problem of identifying the causal relationship between two discrete random variables from observational data. We recently proposed a novel framework called entropic causality that works in a very general functional model but makes the assumption that the unobserved exogenous variable has small entropy in the true causal direction. This framework requires the solution of a minimum entropy coupling problem: given marginal distributions of m discrete random variables, each on n states, find the joint distribution with minimum entropy that is consistent with the given marginals. This corresponds to minimizing a concave function of n^m variables over a convex polytope defined by nm linear constraints, called a transportation polytope. Unfortunately, it was recently shown that this minimum entropy coupling problem is NP-hard, even for two variables, each on n states. Even representing points (joint distributions) in this polytope can require complexity exponential in n and m if done naively. In our recent work we introduced an efficient greedy algorithm that finds an approximate solution to this problem. In this paper we analyze this algorithm and establish two results: the algorithm always finds a local minimum, and its output is within an additive approximation error of the unknown global optimum.
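To make the coupling problem concrete, the following is a minimal Python sketch of a greedy strategy of this flavor for the two-marginal case: repeatedly match the largest remaining mass in each marginal and assign the smaller of the two as joint mass. This is a sketch in the spirit of the greedy approach described above, not necessarily the paper's exact implementation; the function names, the numerical tolerance, and the example marginals are illustrative assumptions.

```python
import numpy as np

def greedy_min_entropy_coupling(p, q):
    """Greedy heuristic for the two-marginal minimum entropy coupling.

    Repeatedly match the largest remaining mass in each marginal,
    assign the smaller of the two as joint probability mass, and
    subtract it from both marginals. Returns the sparse support of
    the resulting coupling as a list of ((i, j), mass) pairs.
    """
    p = np.array(p, dtype=float)
    q = np.array(q, dtype=float)
    support = []
    # Each iteration zeroes out at least one marginal entry, so the
    # loop runs at most len(p) + len(q) - 1 times.
    while p.max() > 1e-12 and q.max() > 1e-12:
        i, j = p.argmax(), q.argmax()
        mass = min(p[i], q[j])
        support.append(((i, j), mass))
        p[i] -= mass
        q[j] -= mass
    return support

def entropy_bits(masses):
    """Shannon entropy (in bits) of a list of probability masses."""
    m = np.array([x for x in masses if x > 0])
    return float(-(m * np.log2(m)).sum())

# Example usage with hypothetical marginals: couple them greedily
# and report the entropy of the resulting joint distribution.
coupling = greedy_min_entropy_coupling([0.5, 0.3, 0.2], [0.6, 0.4])
print(coupling)                                   # sparse joint support
print(entropy_bits([m for _, m in coupling]))     # entropy of the coupling
```

Note that the coupling produced this way has support of size at most n_p + n_q - 1, so it can be stored sparsely even when the full joint table would be too large to write down, which is the representation concern raised above.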
