Gene clustering using self-organizing maps and particle swarm optimization

Gene clustering, the process of grouping related genes into the same cluster, underlies many genomic studies that aim to analyze gene function. Microarray technologies have made it possible to measure the expression levels of thousands of genes simultaneously. For knowledge to be extracted from the resulting datasets, they must be presented to a scientist in a meaningful way, and gene clustering methods serve this purpose. This paper proposes a hybrid clustering approach based on self-organizing maps and particle swarm optimization. In the proposed algorithm, the rate of convergence is improved by adding a conscience factor to the self-organizing maps algorithm. The robustness of the result is measured using a resampling technique. The algorithm is implemented on a cluster of workstations.
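
The abstract does not say how the conscience factor enters the self-organizing map update, so the following is only a minimal sketch under stated assumptions. It uses a one-dimensional map and a DeSieno-style conscience term: each unit keeps a running estimate of how often it is the nearest unit, and units that are nearest too often are penalized when the winner is chosen. The function names, the parameters `beta` and `gamma`, and the linear decay schedule are illustrative choices, not taken from the paper. The particle swarm optimization stage and the resampling step are not shown.

```python
import numpy as np

def train_som_conscience(data, n_units=4, epochs=20, lr=0.5, sigma=1.0,
                         beta=1e-4, gamma=10.0, seed=0):
    """Train a 1-D SOM with a conscience term (illustrative sketch).

    beta  : step size for the win-frequency estimate (assumed value)
    gamma : strength of the conscience bias (assumed value)
    """
    rng = np.random.default_rng(seed)
    n_samples, _ = data.shape
    # Initialize unit weights from randomly chosen samples.
    idx = rng.choice(n_samples, size=n_units, replace=False)
    weights = data[idx].astype(float)
    # Each unit starts with an equal estimated share of wins.
    win_freq = np.full(n_units, 1.0 / n_units)
    grid = np.arange(n_units)

    for epoch in range(epochs):
        frac = 1.0 - epoch / epochs          # linear decay (assumed schedule)
        width = sigma * frac + 1e-3          # neighbourhood width, kept positive
        for x in data[rng.permutation(n_samples)]:
            dist2 = np.sum((weights - x) ** 2, axis=1)
            # Conscience: units that win more than 1/n_units get a negative bias.
            bias = gamma * (1.0 / n_units - win_freq)
            winner = np.argmin(dist2 - bias)
            # Win frequencies track the unbiased nearest unit.
            indicator = np.zeros(n_units)
            indicator[np.argmin(dist2)] = 1.0
            win_freq += beta * (indicator - win_freq)
            # Standard SOM update with a Gaussian neighbourhood on the grid.
            h = np.exp(-((grid - winner) ** 2) / (2.0 * width ** 2))
            weights += (lr * frac) * h[:, None] * (x - weights)
    return weights, win_freq

def assign_clusters(data, weights):
    """Label each expression profile with the index of its nearest unit."""
    d = ((data[:, None, :] - weights[None, :, :]) ** 2).sum(axis=2)
    return d.argmin(axis=1)
```

Here `data` would be a genes-by-conditions expression matrix, and `assign_clusters(data, weights)` returns one cluster index per gene. The bias only affects which unit is selected for the update; the final assignment uses plain nearest-unit distance.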
