Swarm Intelligence Algorithms in Bioinformatics

Research in bioinformatics requires advanced computing tools for processing large volumes of ambiguous and uncertain biological data. Swarm Intelligence (SI) has recently emerged as a family of nature-inspired algorithms, known especially for their ability to produce low-cost, fast and reasonably accurate solutions to complex search problems. In this chapter, we explore the role of SI algorithms in bioinformatics tasks such as microarray data clustering, multiple sequence alignment, protein structure prediction and molecular docking. The chapter begins with an overview of the basic concepts of bioinformatics and their biological basis. It then introduces swarm intelligence, with special emphasis on two SI algorithms: Particle Swarm Optimization (PSO) and Ant Colony Systems (ACS). Next, it provides a detailed survey of state-of-the-art research on the applications of SI algorithms in bioinformatics. The chapter concludes with a discussion of how SI algorithms can be used to solve a few open-ended problems in bioinformatics.
