DeepTox: Toxicity Prediction using Deep Learning

The Tox21 Data Challenge has been the largest effort of the scientific community to compare computational methods for toxicity prediction. This challenge comprised 12,000 environmental chemicals and drugs that were measured for 12 different toxic effects by specifically designed assays. We participated in this challenge to assess the performance of Deep Learning in computational toxicity prediction. Deep Learning has already revolutionized image processing, speech recognition, and language understanding but has not yet been applied to computational toxicity. Deep Learning is founded on novel algorithms and architectures for artificial neural networks together with the recent availability of very fast computers and massive datasets. It discovers multiple levels of distributed representations of the input, with higher levels representing more abstract concepts. We hypothesized that the construction of a hierarchy of chemical features gives Deep Learning the edge over other toxicity prediction methods. Furthermore, Deep Learning naturally enables multi-task learning, that is, learning all toxic effects in one neural network and thereby learning highly informative chemical features. To use Deep Learning for toxicity prediction, we have developed the DeepTox pipeline. First, DeepTox normalizes the chemical representations of the compounds. Then it computes a large number of chemical descriptors that are used as input to machine learning methods. In its next step, DeepTox trains models, evaluates them, and combines the best of them into ensembles. Finally, DeepTox predicts the toxicity of new compounds. In the Tox21 Data Challenge, DeepTox had the highest performance of all computational methods, winning the grand challenge, the nuclear receptor panel, the stress response panel, and six single assays (teams "Bioinf@JKU"). We found that Deep Learning excelled in toxicity prediction and outperformed many other computational approaches like naive Bayes, support vector machines, and random forests.
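The multi-task idea described above, one network whose hidden layer is shared by all toxic effects, can be illustrated with a minimal sketch. This is not the DeepTox implementation: the layer sizes, the ReLU and sigmoid activations, the random input vector, and the choice to skip unmeasured assays in the loss are all assumptions made for illustration, and the helper names (`init_layer`, `forward`, `masked_bce`) are hypothetical.

```python
import math
import random

N_TASKS = 12  # the abstract mentions 12 toxic effects


def init_layer(n_in, n_out, rng):
    """Return an n_out x n_in weight matrix with small random values."""
    scale = 1.0 / math.sqrt(n_in)
    return [[rng.uniform(-scale, scale) for _ in range(n_in)]
            for _ in range(n_out)]


def forward(x, w_hidden, w_out):
    """Shared hidden layer (ReLU, assumed) feeding one sigmoid unit per task."""
    hidden = [max(0.0, sum(w * xi for w, xi in zip(row, x)))
              for row in w_hidden]
    return [1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
            for row in w_out]


def masked_bce(probs, labels):
    """Mean binary cross-entropy over tasks that have a label.

    A label of None marks an assay that was not measured for this
    compound (an assumption of this sketch); such tasks are skipped.
    """
    eps = 1e-12
    terms = [-(y * math.log(p + eps) + (1 - y) * math.log(1 - p + eps))
             for p, y in zip(probs, labels) if y is not None]
    return sum(terms) / len(terms) if terms else 0.0


rng = random.Random(0)
n_features, n_hidden = 8, 16  # toy sizes, chosen only for illustration

w_hidden = init_layer(n_features, n_hidden, rng)
w_out = init_layer(n_hidden, N_TASKS, rng)

# A made-up descriptor vector and a partially labeled compound.
x = [rng.random() for _ in range(n_features)]
labels = [1, 0, None, 0, None, 1, 0, 0, None, 0, 1, None]

probs = forward(x, w_hidden, w_out)
loss = masked_bce(probs, labels)
print(len(probs), "task probabilities, masked loss =", round(loss, 4))
```

Because every output unit reads from the same hidden layer, a gradient step on this loss would update the shared weights using all labeled tasks at once, which is the sense in which the features are learned jointly.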
