Exploiting the Japanese Toxicogenomics Project for Predictive Modelling of Drug Toxicity

Motivation: In the last decade, surprisingly few drugs reached the market. Approximately 80% of promising drug candidates failed during or after Phase I, due among other reasons to undetected toxicity [1]. The problem becomes even more apparent in the context of drug-induced illness, which causes approximately 100,000 deaths per year in the USA alone [2]. Toxicogenomics tries to avoid such problems by prioritizing less toxic drugs over more toxic ones in early drug discovery. To this end, it employs high-throughput molecular profiling technologies to predict the toxicity of drug candidates. This prediction requires large-scale -omics studies of drug-treated cell lines and/or pharmacological model organisms. However, exploiting the data from such large-scale studies requires a highly optimized analysis pipeline that provides methods for batch-effect correction, noise reduction, dimensionality reduction, normalization, summarization, filtering and prediction. In this work, we present a novel pipeline for the analysis of large-scale data sets, in particular transcriptomics data. We tested our pipeline on the Japanese Toxicogenomics Project (TGP) [3], where we evaluated to what degree in vitro bioassays can be used to predict in vivo responses.

[1] P. Corey et al. Incidence of Adverse Drug Reactions in Hospitalized Patients, 2012.

[2] Terence P. Speed et al. A comparison of normalization methods for high density oligonucleotide array data based on variance and bias, 2003, Bioinform.

[3] R. Myers et al. Evolving gene/transcript definitions significantly alter the interpretation of GeneChip data, 2005, Nucleic acids research.

[4] Klaus Obermayer et al. A new summarization method for affymetrix probe level data, 2006, Bioinform.

[5] Klaus Obermayer et al. Support Vector Machines for Dyadic Data, 2006, Neural Computation.

[6] Hinrich W. H. Göhlmann et al. I/NI-calls for the exclusion of non-informative genes: a highly effective filtering tool for microarray data, 2007, Bioinform.

[7] Charles C. Persinger et al. How to improve R&D productivity: the pharmaceutical industry's grand challenge, 2010, Nature Reviews Drug Discovery.

[8] Adetayo Kasim et al. Informative or Noninformative Calls for Gene Expression: A Latent Variable Approach, 2010, Statistical applications in genetics and molecular biology.

[9] Adetayo Kasim et al. Filtering data from high-throughput experiments based on measurement reliability, 2010, Proceedings of the National Academy of Sciences.

[10] H. Yamada et al. The Japanese toxicogenomics project: application of toxicogenomics, 2010, Molecular nutrition & food research.

[11] Weida Tong et al. FDA-approved drug labeling for the study of drug-induced liver injury, 2011, Drug discovery today.