De Novo Molecular Design Using Graph Kernels

Declaration

I hereby declare that I wrote this thesis independently, that I used no sources or aids other than those indicated, and that I have marked as such all passages taken verbatim or in substance from published works. I further declare that this thesis has not, in whole or in part, already been prepared for another examination.

Acknowledgements

First, I would like to thank Dr. Fabrizio Costa, who proposed the subject of this thesis and closely supervised me throughout the process. He came up with many of the ideas that made this project possible. I would also like to thank him for the code of the NSPDK kernel, which plays a major role in the GraSCoM implementation. Finally, I want to thank him for offering me the opportunity to work on such an interesting project. I cannot imagine a topic I would have enjoyed more for my Master's thesis.

I would also like to thank Prof. Dr. Rolf Backofen for having me as part of the group (as a student assistant) over the years of my studies. I have learned a lot and I definitely enjoyed this experience. Another big thank you goes to the group of Dr. Hauke Busch from the ZBSA for allowing me to run some of the experiments on their cluster.

I also want to thank my parents for encouraging and supporting me over the years. A final but very special thanks goes to my girlfriend Ilinca, who supported me and made my days working on the thesis a lot nicer.

Summary

The synthesis of small molecules that improve on the curative properties of existing drugs, or that are effective against previously untreatable illnesses, is a very hard task in which pharmaceutical companies invest enormous amounts of resources. Despite this, studies show that only one out of 5000 screened drug candidates reaches the market, and pharmaceutical companies are therefore looking for fail-fast, fail-cheap solutions.
In this context, computational methods become an interesting alternative if they manage to replace the expensive and time-consuming phases of design, synthesis and testing. Among such methods, computer-aided de novo molecular design approaches are particularly interesting because they build novel molecular structures with desired pharmacological properties from scratch, in an incremental fashion. One of the biggest challenges such systems face is the exploration of a practically infinite chemical space. In this …
