Evolutionary Ensemble for In Silico Prediction of Ames Test Mutagenicity

Driven by new regulations and animal welfare concerns, the need for in silico models as alternative approaches to the safety assessment of chemicals without animal testing has increased recently. This paper describes a novel machine learning ensemble approach to building an in silico model for predicting Ames test mutagenicity, one of a battery of the most commonly used experimental in vitro and in vivo genotoxicity tests for the safety evaluation of chemicals. Evolutionary random neural ensemble with negative correlation learning (ERNE) [1] was developed based on neural networks (NNs) and evolutionary algorithms. ERNE combines bootstrap sampling of the training data with random subspace feature selection to ensure diversity among the individuals of the initial ensemble. Furthermore, while evolving individuals within the ensemble, it uses negative correlation learning, which enables the individual NNs to be trained to be as accurate as possible while remaining as diverse as possible. As a result, the individuals in the final ensemble can cooperate collectively to achieve better generalization. The empirical experiments suggest that ERNE is an effective ensemble approach for predicting the Ames test mutagenicity of chemicals.
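The two diversity mechanisms named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract gives no network architecture, evolutionary operators, or parameter values, so the function names (`init_member_views`, `ncl_losses`), the penalty strength `lam`, and the subspace size are all hypothetical. The penalty term follows the form commonly used in the negative correlation learning literature, which is assumed here rather than taken from the abstract.

```python
import numpy as np


def init_member_views(n_samples, n_features, n_members, subspace_size, seed=0):
    """Draw one (row indices, feature indices) pair per ensemble member.

    Rows are a bootstrap sample (drawn with replacement); features are a
    random subspace (drawn without replacement). All names are illustrative.
    """
    rng = np.random.default_rng(seed)
    views = []
    for _ in range(n_members):
        rows = rng.integers(0, n_samples, size=n_samples)
        cols = np.sort(rng.choice(n_features, size=subspace_size, replace=False))
        views.append((rows, cols))
    return views


def ncl_losses(member_outputs, targets, lam=0.5):
    """Mean per-member loss with a negative correlation penalty.

    member_outputs has shape (n_members, n_samples). For member i the
    assumed per-sample loss is
        0.5 * (f_i - y)^2 + lam * (f_i - f_bar) * sum_{j != i} (f_j - f_bar),
    where f_bar is the ensemble mean. The penalty is non-positive, so it
    rewards members that deviate from the ensemble mean.
    """
    f_bar = member_outputs.mean(axis=0)
    dev = member_outputs - f_bar
    # Sum of the other members' deviations = total deviation minus own.
    penalty = dev * (dev.sum(axis=0) - dev)
    per_sample = 0.5 * (member_outputs - targets) ** 2 + lam * penalty
    return per_sample.mean(axis=1)


if __name__ == "__main__":
    views = init_member_views(n_samples=100, n_features=20,
                              n_members=5, subspace_size=8)
    rows, cols = views[0]
    print(rows.shape, cols)

    rng = np.random.default_rng(1)
    y = rng.integers(0, 2, size=100).astype(float)   # toy binary labels
    outputs = rng.random((5, 100))                   # stand-in NN outputs
    print(ncl_losses(outputs, y, lam=0.5))
```

In a full system, each member would be a neural network trained on `X[rows][:, cols]` with this loss, and an evolutionary algorithm would then select and vary members; those steps are omitted because the abstract does not specify them.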

[1]  L. Breiman  Out-of-Bag Estimation , 1996 .

[2]  Xin Yao,et al.  Simultaneous training of negatively correlated neural networks in an ensemble , 1999, IEEE Trans. Syst. Man Cybern. Part B.

[3]  Tatsuya Akutsu,et al.  Graph Kernels for Molecular Structure-Activity Relationship Analysis with Support Vector Machines , 2005, J. Chem. Inf. Model..

[4]  Peter C Jurs,et al.  Predicting the genotoxicity of polycyclic aromatic compounds from molecular structure with different classifiers. , 2003, Chemical research in toxicology.

[5]  Lutz Müller,et al.  Evaluation of the ability of a battery of three in vitro genotoxicity tests to discriminate rodent carcinogens and non-carcinogens I. Sensitivity, specificity and relative predictivity. , 2005, Mutation research.

[6]  Thomas G. Dietterich  An Experimental Comparison of Three Methods for Constructing Ensembles of Decision Trees , 2000 .

[7]  Lars Kai Hansen,et al.  Neural Network Ensembles , 1990, IEEE Trans. Pattern Anal. Mach. Intell..

[8]  Peter Tiño,et al.  Managing Diversity in Regression Ensembles , 2005, J. Mach. Learn. Res..

[9]  Leo Breiman,et al.  Random Forests , 2001, Machine Learning.

[10]  L. Hall,et al.  Three new consensus QSAR models for the prediction of Ames genotoxicity. , 2004, Mutagenesis.

[11]  João Aires-de-Sousa,et al.  Random Forest Prediction of Mutagenicity from Empirical Physicochemical Descriptors. , 2007 .

[12]  Tin Kam Ho,et al.  The Random Subspace Method for Constructing Decision Forests , 1998, IEEE Trans. Pattern Anal. Mach. Intell..

[13]  Hussein A. Abbass,et al.  Analyzing Anti–correlation in Ensemble Learning , 2007 .

[14]  Thomas G. Dietterich An Experimental Comparison of Three Methods for Constructing Ensembles of Decision Trees: Bagging, Boosting, and Randomization , 2000, Machine Learning.

[15]  David Kirkland,et al.  Evaluation of the ability of a battery of three in vitro genotoxicity tests to discriminate rodent carcinogens and non-carcinogens III. Appropriate follow-up testing in vivo. , 2005, Mutation research.

[16]  Sherif Hashem,et al.  Optimal Linear Combinations of Neural Networks , 1997, Neural Networks.

[17]  Manfred M. Fischer,et al.  Neural network ensembles and their application to traffic flow prediction in telecommunications networks , 2001, IJCNN'01. International Joint Conference on Neural Networks. Proceedings (Cat. No.01CH37222).

[18]  Xin Yao,et al.  Ensemble learning via negative correlation , 1999, Neural Networks.

[19]  Xin Yao,et al.  Evolutionary ensembles with negative correlation learning , 2000, IEEE Trans. Evol. Comput..

[20]  Huanhuan Chen,et al.  Evolutionary random neural ensembles based on negative correlation learning , 2007, 2007 IEEE Congress on Evolutionary Computation.