A partial orthogonalization method for simulating covariance and concentration graph matrices

Structure learning methods for covariance and concentration graphs are often validated on synthetic models, usually obtained by randomly generating: (i) an undirected graph, and (ii) a compatible symmetric positive definite (SPD) matrix. To ensure positive definiteness in (ii), diagonal dominance is usually imposed. However, the link strengths in the resulting graphical model, determined by the off-diagonal entries of the SPD matrix, are in many scenarios extremely weak. Recovering the structure of the undirected graph thus becomes a challenge, and algorithm validation is notably affected. In this paper, we propose an alternative method that overcomes this problem while still yielding a compatible SPD matrix. We generate a partially row-wise-orthogonal matrix factor, in which pairwise orthogonal rows correspond to missing edges in the undirected graph. In numerical experiments ranging from moderately dense to sparse scenarios, we find that, as the dimension increases, the link strength we simulate is stable with respect to the sparsity of the structure. Importantly, we show in a real validation setting that structure recovery improves greatly for all learning algorithms when our proposed method is used, thereby producing a more realistic comparison framework.
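
The partial orthogonalization idea can be illustrated with a minimal Python sketch. It is only one possible reading of the abstract, not the authors' reference implementation: the function names `random_graph` and `partial_orthogonalization`, the Gaussian initial factor, the least-squares projection step, and the final row normalization are all assumptions made here for illustration.

```python
import numpy as np


def random_graph(p, density, rng):
    """Symmetric boolean adjacency matrix with independent edges (hypothetical helper)."""
    upper = np.triu(rng.random((p, p)) < density, k=1)
    return upper | upper.T


def partial_orthogonalization(adj, rng):
    """Return an SPD matrix with zeros wherever `adj` has no edge (illustrative sketch)."""
    p = adj.shape[0]
    # Assumption: start from a square factor with i.i.d. standard normal entries.
    Q = rng.standard_normal((p, p))
    for i in range(1, p):
        # Earlier rows that must end up orthogonal to row i (missing edges).
        non_neighbors = [j for j in range(i) if not adj[i, j]]
        if non_neighbors:
            B = Q[non_neighbors]
            # Remove from row i its projection onto the span of those rows.
            coef, *_ = np.linalg.lstsq(B.T, Q[i], rcond=None)
            Q[i] -= B.T @ coef
    # Assumption: scale rows to unit norm so the result has a unit diagonal.
    Q /= np.linalg.norm(Q, axis=1, keepdims=True)
    return Q @ Q.T


rng = np.random.default_rng(0)
adj = random_graph(12, 0.3, rng)
sigma = partial_orthogonalization(adj, rng)
```

Because entry (i, j) of the product is the inner product of rows i and j of the factor, orthogonal rows give exact zeros at the missing edges, and the matrix is positive definite whenever the factor has full row rank. The same matrix can be read as a covariance matrix (covariance graph) or as a concentration matrix (concentration graph).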
