Bayesian Gene Set Analysis

Author(s): Shahbaba, Babak; Tibshirani, Robert; Shachaf, Catherine M; Plevritis, Sylvia K | Abstract: Gene expression microarray technologies simultaneously measure the expression of a large number of genes. Typical analyses of such data focus on individual genes, but recent work has demonstrated that evaluating changes in expression across predefined sets of genes often increases statistical power and produces more robust results. We introduce a new methodology for identifying gene sets that are differentially expressed under varying experimental conditions. Our approach uses a hierarchical Bayesian framework in which a hyperparameter measures the significance of each gene set. Using simulated data, we compare our proposed method to alternative approaches, such as Gene Set Enrichment Analysis (GSEA) and Gene Set Analysis (GSA). In these simulations, our approach provides the best overall performance. We also discuss the application of our method to experimental data based on p53 mutation status.
