Randomized approximation algorithms for set multicover problems with applications to reverse engineering of protein and gene networks

In this paper we investigate the computational complexity of a combinatorial problem that arises in the reverse engineering of protein and gene networks. Our contributions are as follows:

- We abstract a combinatorial version of the problem and observe that it is equivalent to the set multicover problem when the coverage factor k is a function of the number of elements n of the universe. An important special case for our application is k = n - 1.
- We observe that the standard greedy algorithm produces an approximation ratio of Ω(log n) even if k is large, i.e., k = n - c for some constant c > 0.
- Let 1 < a < n denote the maximum number of elements in any given set in our set multicover problem. We show that a non-trivial analysis of a simple randomized polynomial-time approximation algorithm for this problem yields an expected approximation ratio E[r(a, k)] that is an increasing function of a/k. The behavior of E[r(a, k)] is roughly as follows: it is about ln(a/k) when a/k is at least about e^2 ≈ 7.39, and for smaller values of a/k it decreases towards 2 exponentially with increasing k, with lim_{a/k→0} E[r(a, k)] < 2. Our randomized algorithm is a cascade of a deterministic and a randomized rounding step, parameterized by a quantity β, followed by a greedy solution for the remaining problem.
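To make the set multicover setting concrete, the following is a minimal Python sketch of the standard greedy heuristic mentioned above: repeatedly pick the unused set that covers the most elements still needing coverage, until every element is covered at least k times. It is an illustration only, not the paper's randomized algorithm. The function name `greedy_multicover`, the dictionary-based input format, the assumption that each set may be picked at most once, and the alphabetical tie-breaking rule are all choices made for this sketch.

```python
from typing import Dict, Hashable, Iterable, List, Set


def greedy_multicover(
    universe: Iterable[Hashable],
    sets: Dict[str, Set[Hashable]],
    k: int,
) -> List[str]:
    """Greedy heuristic for set multicover (illustrative sketch).

    Assumptions of this sketch: every element must be covered by at
    least k distinct chosen sets, and each set is chosen at most once.
    Returns the names of the chosen sets in the order they were picked.
    """
    # Remaining coverage requirement for each element.
    need = {x: k for x in universe}
    remaining = set(sets)
    chosen: List[str] = []

    while any(v > 0 for v in need.values()):
        best, best_gain = None, 0
        # Sorted iteration makes tie-breaking deterministic (alphabetical).
        for name in sorted(remaining):
            # Gain = number of elements in this set that still need coverage.
            gain = sum(1 for x in sets[name] if need.get(x, 0) > 0)
            if gain > best_gain:
                best, best_gain = name, gain
        if best is None:
            # No unused set helps any deficient element.
            raise ValueError("instance is infeasible for this coverage factor")
        chosen.append(best)
        remaining.remove(best)
        for x in sets[best]:
            if need.get(x, 0) > 0:
                need[x] -= 1

    return chosen


if __name__ == "__main__":
    example_sets = {"A": {1, 2}, "B": {2, 3}, "C": {1, 3}, "D": {1, 2, 3}}
    print(greedy_multicover({1, 2, 3}, example_sets, k=2))
```

In the small example, the universe has n = 3 elements and k = n - 1 = 2, matching the special case highlighted in the first contribution.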
