Detection of Nonlinear Effects in Gene Expression Pathways

A central task in systems biology is modelling genetic pathways. Genes whose expression values depend linearly on one another are easy to identify as members of a pathway. However, if feedback loops or signal cascades are present, the expression values of pathway genes can depend nonlinearly on the expression values of other genes in the pathway. Such genes are hard to detect as pathway members because nonlinearity must be distinguished from noise.

We propose an algorithm to infer nonlinear network elements in pathways from microarray data. Our model assumes that the expression values of genes belonging to one pathway are mainly driven by a single latent factor. We expect a pathway to contain two groups of genes: genes in the first group depend linearly on the latent factor, whereas genes in the second group depend nonlinearly on it. The goal is to identify, for each gene, which kind of dependence on the latent factor it shows.

Our algorithm for detecting nonlinear effects is an extension of linear Gaussian factor analysis. Nonlinearities are modelled by the square of the latent variable, weighted by gene-specific coefficients. We derived a novel model selection method for this generalization of factor analysis. To avoid interpreting noise as nonlinearity, we determine p-values that measure the probability that a linear gene is detected as nonlinear by chance. We apply our algorithm to microarray data from breast cancer samples, where we identified nonlinear dependencies of gene expression values in the p53 pathway.
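The generative assumption described above (one latent factor, with some genes carrying an additional squared term) can be illustrated with a small simulation. The sketch below is not the model selection method of this work. It draws synthetic data from a model of that form and, as a simplification, treats the latent factor as known, whereas the actual algorithm has to infer it. The p-value is obtained from a simple parametric-bootstrap null fitted under the linear-only model, which is a stand-in chosen for illustration. All names (`simulate_pathway`, `quadratic_coef`, `nonlinearity_p_value`) and all numeric settings (loadings, quadratic coefficient 0.8, noise level, sample counts) are hypothetical.

```python
import numpy as np


def simulate_pathway(n_samples=200, n_linear=8, n_nonlinear=2,
                     noise_sd=0.3, seed=0):
    """Draw synthetic expression values driven by a single latent factor z.

    Assumed form (illustrative): x_i = lam_i * z + gamma_i * z**2 + noise,
    with gamma_i = 0 for the linear genes.
    """
    rng = np.random.default_rng(seed)
    n_genes = n_linear + n_nonlinear
    z = rng.standard_normal(n_samples)             # one latent value per sample
    lam = rng.uniform(0.8, 1.2, size=n_genes)      # linear loadings (assumed range)
    gamma = np.zeros(n_genes)
    gamma[n_linear:] = 0.8                         # quadratic weight, nonlinear genes only
    noise = noise_sd * rng.standard_normal((n_samples, n_genes))
    x = np.outer(z, lam) + np.outer(z ** 2, gamma) + noise
    return x, z, gamma


def quadratic_coef(x_gene, z):
    """Least-squares coefficient of z**2 in the fit x ~ 1 + z + z**2."""
    design = np.column_stack([np.ones_like(z), z, z ** 2])
    coef, *_ = np.linalg.lstsq(design, x_gene, rcond=None)
    return coef[2]


def nonlinearity_p_value(x_gene, z, n_null=500, seed=1):
    """Estimate how often a purely linear gene yields a quadratic
    coefficient at least as large as the observed one."""
    rng = np.random.default_rng(seed)
    observed = abs(quadratic_coef(x_gene, z))

    # Fit the linear-only null model and estimate its residual noise level.
    design = np.column_stack([np.ones_like(z), z])
    coef, *_ = np.linalg.lstsq(design, x_gene, rcond=None)
    fitted = design @ coef
    resid_sd = np.std(x_gene - fitted, ddof=2)

    # Simulate linear genes from the null fit and refit the quadratic model.
    null = np.empty(n_null)
    for b in range(n_null):
        x_null = fitted + resid_sd * rng.standard_normal(z.size)
        null[b] = abs(quadratic_coef(x_null, z))

    # Add-one correction keeps the estimate strictly positive.
    return (1 + np.sum(null >= observed)) / (1 + n_null)


x, z, gamma = simulate_pathway()
p_values = np.array([nonlinearity_p_value(x[:, i], z)
                     for i in range(x.shape[1])])
for i, p in enumerate(p_values):
    kind = "nonlinear" if gamma[i] != 0 else "linear"
    print(f"gene {i} (simulated as {kind}): p = {p:.3f}")
```

In this toy setting the genes simulated with a squared term receive very small p-values, while the linear genes receive p-values spread across the unit interval. With real data the latent factor is unobserved, which is exactly why the nonlinear extension of factor analysis and a dedicated model selection step are needed.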