An analysis of the use of probabilistic modeling for synaptic connectivity prediction from genomic data

Identifying the specific genes that influence particular phenotypes is a common problem in genetic studies. In this paper we address the problem of determining the influence of joint gene expression on synapse predictability. The question is posed as an optimization problem in which the conditional entropy of the synaptic connectivity phenotype given a gene subset is minimized. We investigate the use of single- and multi-objective estimation of distribution algorithms and focus on real data from C. elegans synaptic connectivity. We show that the introduced algorithms are able to compute gene sets that allow accurate synapse prediction. The multi-objective approach, however, can simultaneously search for gene sets with different numbers of genes. Our results also indicate that optimization problems defined on constrained binary spaces remain challenging for the design of competitive estimation of distribution algorithms.
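As a rough illustration of the objective described above, the following sketch estimates the conditional entropy of a binary phenotype given the joint expression of a gene subset, using empirical counts. The data layout (one row of binary expression values per sample), the function name `conditional_entropy`, and the toy data are assumptions made for this example and are not taken from the paper; the paper's own encoding of neurons, synapses and expression profiles may differ. An exhaustive search over pairs stands in for the estimation of distribution algorithms, which would be needed once the number of genes makes enumeration infeasible.

```python
from collections import Counter
from itertools import combinations, product
from math import log2


def conditional_entropy(expression, phenotype, subset):
    """Empirical H(phenotype | genes in subset), in bits.

    expression: list of rows, each a sequence of discrete expression values
    phenotype:  list of labels, one per row (e.g. 1 = synapse, 0 = none)
    subset:     indices of the genes whose joint expression is conditioned on
    """
    n = len(phenotype)
    joint = Counter()     # counts of (joint expression pattern, label)
    marginal = Counter()  # counts of the joint expression pattern alone
    for row, label in zip(expression, phenotype):
        pattern = tuple(row[g] for g in subset)
        joint[(pattern, label)] += 1
        marginal[pattern] += 1
    # H(Y | X_S) = - sum_{x, y} p(x, y) * log2 p(y | x)
    return -sum(
        count / n * log2(count / marginal[pattern])
        for (pattern, _label), count in joint.items()
    )


# Hypothetical toy data: three binary genes, every expression pattern once.
# The phenotype depends on genes 0 and 1 jointly (XOR), not on either alone.
expression = [list(bits) for bits in product([0, 1], repeat=3)]
phenotype = [row[0] ^ row[1] for row in expression]

h_single = conditional_entropy(expression, phenotype, (0,))
h_pair = conditional_entropy(expression, phenotype, (0, 1))

# Exhaustive search over all gene pairs (feasible only for tiny problems).
best_pair = min(
    combinations(range(3), 2),
    key=lambda s: conditional_entropy(expression, phenotype, s),
)
print(h_single, h_pair, best_pair)
```

In this toy case no single gene reduces the uncertainty about the phenotype, while the pair of genes 0 and 1 removes it entirely, which is the kind of joint-expression effect that a subset-based objective is meant to capture. A multi-objective variant would additionally treat subset size as a second objective rather than fixing it in advance.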
