Physicochemical Methods for Prediction of Functional Information for Proteins

Structural genomics initiatives are determining thousands of new protein structures. Many of these proteins are of unknown function, so computational methods that rapidly extract functional information from protein structure are needed. We present details of how functional information is obtained from structure using THEMATICS (Theoretical Microscopic Titration Curves). THEMATICS is a computational procedure that gives information about chemical reactivity, based on solution of the Poisson-Boltzmann equations for the electrostatic potential. We show how anomalies in predicted titration curves are identified. We show further that when residues with anomalous predicted titration curves form a cluster in physical space, these residues tend to be very highly conserved across species, and such clusters are reliable predictors of the active site. Results are given for ten enzymes; detailed results are shown for triosephosphate isomerase (from chicken), 6-hydroxymethyl-7,8-dihydropterin pyrophosphokinase (from E. coli), and papain (from papaya).
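To illustrate what an anomalous titration curve can look like, the following is a minimal sketch and not the THEMATICS procedure itself, which relies on Poisson-Boltzmann calculations for the whole protein. It assumes a toy model of two ionizable sites coupled by a repulsive interaction, and compares the resulting curve with the ideal Henderson-Hasselbalch curve of an isolated site. All function names, the pKa values, the coupling strength, and the 0.9-to-0.1 width measure are hypothetical choices made for illustration; they are not the anomaly criterion used in the paper.

```python
import math


def hh_protonation(ph, pka):
    """Mean protonation of an isolated site (Henderson-Hasselbalch)."""
    return 1.0 / (1.0 + 10.0 ** (ph - pka))


def two_site_protonation(ph, pka1, pka2, w):
    """Mean protonation of site 1 when it is coupled to site 2.

    w is an assumed interaction penalty, in pK units, that applies only
    when both sites are protonated. With w = 0 the result reduces to the
    Henderson-Hasselbalch curve for site 1.
    """
    # Statistical weights relative to the fully deprotonated state.
    only_1 = 10.0 ** (pka1 - ph)
    only_2 = 10.0 ** (pka2 - ph)
    both = only_1 * only_2 * 10.0 ** (-w)
    partition = 1.0 + only_1 + only_2 + both
    return (only_1 + both) / partition


def transition_width(curve, upper=0.9, lower=0.1,
                     ph_min=0.0, ph_max=14.0, step=0.01):
    """pH span over which a decreasing curve falls from `upper` to `lower`.

    Illustrative measure only: a wide span means the site stays partially
    protonated over an extended pH range.
    """
    ph_upper = None
    ph_lower = None
    n_steps = int(round((ph_max - ph_min) / step))
    for i in range(n_steps + 1):
        ph = ph_min + i * step
        theta = curve(ph)
        if ph_upper is None and theta <= upper:
            ph_upper = ph
        if theta <= lower:
            ph_lower = ph
            break
    if ph_upper is None or ph_lower is None:
        raise ValueError("curve does not cross both thresholds in range")
    return ph_lower - ph_upper


if __name__ == "__main__":
    # Hypothetical parameters: two identical sites with a 2 pK-unit coupling.
    ideal = transition_width(lambda ph: hh_protonation(ph, 7.0))
    coupled = transition_width(
        lambda ph: two_site_protonation(ph, 7.0, 7.0, 2.0))
    print(f"ideal width:   {ideal:.2f} pH units")
    print(f"coupled width: {coupled:.2f} pH units")
```

In this toy model the coupled site titrates over a noticeably wider pH range than the isolated site, which is the kind of elongated curve shape that coupling between ionizable groups can produce. Equal pKa values are used so that the curve is guaranteed to decrease monotonically with pH.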
