Predicting Degree of Relevance of Pathway Markers from Gene Expression Data: A PSO Based Approach

In functional genomics, a pathway is defined as a set of genes that exhibit similar biological activities. Given microarray expression data, the corresponding pathway information can be extracted from public databases. Not all member genes of a pathway are equally relevant in estimating the activity of that pathway: some genes participate strongly in the pathway, while others are only weakly associated with it. Existing methods have either used all member genes or discarded some genes completely when estimating pathway activity. Motivated by this, the current work reports an automated approach to measure the degree of relevance of a given gene in predicting pathway activity. Because a large search space has to be explored, the exploration properties of particle swarm optimization (PSO) are utilized. Particles of the PSO represent different relevance scores for the member genes of different pathways. To incorporate the relevance score, the popular t-score, which is widely used in measuring pathway activity, is extended to a weighted t-score. The proposed PSO-based weighted framework is then evaluated on three gene expression data sets. To demonstrate the effectiveness of the proposed method, the top 50% of pathway markers are selected for each data set, and the quality of these markers is assessed using 10-fold cross-validation with respect to different quality measures. The results are further validated using biological significance tests.
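
The idea of particles encoding per-gene relevance scores can be illustrated with a minimal sketch. The abstract does not give the exact weighted t-score formula, so everything below is an assumption made for illustration: pathway activity is taken as a weight-normalized sum of member-gene expression values, fitness is the absolute Welch t statistic of that activity between two classes, and the PSO is a plain global-best variant with weights clipped to [0, 1]. The function names, parameter values, and toy data are hypothetical and are not taken from the paper.

```python
import math
import random

def weighted_activity(expr, weights):
    """Per-sample pathway activity: weighted sum of gene values,
    scaled by the Euclidean norm of the weights (assumed form)."""
    norm = math.sqrt(sum(w * w for w in weights)) or 1.0
    n_samples = len(expr[0])
    return [sum(w * gene[j] for w, gene in zip(weights, expr)) / norm
            for j in range(n_samples)]

def t_score(a, b):
    """Welch two-sample t statistic between groups a and b."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    va = sum((x - ma) ** 2 for x in a) / (len(a) - 1)
    vb = sum((x - mb) ** 2 for x in b) / (len(b) - 1)
    denom = math.sqrt(va / len(a) + vb / len(b))
    return (ma - mb) / denom if denom > 0 else 0.0

def fitness(weights, expr, labels):
    """Absolute t-score of the weighted activity between class 1 and class 0."""
    act = weighted_activity(expr, weights)
    pos = [x for x, y in zip(act, labels) if y == 1]
    neg = [x for x, y in zip(act, labels) if y == 0]
    return abs(t_score(pos, neg))

def pso(expr, labels, n_particles=20, n_iter=50,
        inertia=0.7, c1=1.5, c2=1.5, seed=0):
    """Global-best PSO over gene weights in [0, 1]; returns (weights, fitness)."""
    rng = random.Random(seed)
    dim = len(expr)
    pos = [[rng.random() for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_fit = [fitness(p, expr, labels) for p in pos]
    g = max(range(n_particles), key=pbest_fit.__getitem__)
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]
    for _ in range(n_iter):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (inertia * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                # Keep each relevance score inside [0, 1].
                pos[i][d] = min(1.0, max(0.0, pos[i][d] + vel[i][d]))
            f = fitness(pos[i], expr, labels)
            if f > pbest_fit[i]:
                pbest[i], pbest_fit[i] = pos[i][:], f
                if f > gbest_fit:
                    gbest, gbest_fit = pos[i][:], f
    return gbest, gbest_fit

# Toy pathway (made-up values): gene 0 differs between classes, genes 1-2 are noise.
toy_expr = [
    [2.1, 1.9, 2.3, 2.0, 0.1, -0.2, 0.0, 0.2],
    [0.3, -0.4, 0.1, 0.5, 0.2, -0.1, 0.4, -0.3],
    [-0.2, 0.6, -0.5, 0.1, 0.3, -0.6, 0.2, 0.0],
]
toy_labels = [1, 1, 1, 1, 0, 0, 0, 0]
best_weights, best_fit = pso(toy_expr, toy_labels)
print("relevance scores:", [round(w, 3) for w in best_weights])
print("weighted t-score:", round(best_fit, 3))
```

In this sketch a weight of 1 keeps a gene fully, a weight of 0 discards it, and intermediate values express partial relevance, which is the middle ground between the two existing strategies described above. A faithful implementation would follow the paper's own weighted t-score definition and evaluate the selected markers with cross-validation.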
