Statistical criteria for the identification of protein active sites using theoretical microscopic titration curves

Theoretical Microscopic Titration Curves (THEMATICS) may be used to identify chemically important residues in the active sites of enzymes by their characteristic deviations from normal, sigmoidal Henderson–Hasselbalch titration behavior. Clusters of such deviant residues in physical proximity are reliable predictors of the location of the active site. Originally, residues with deviant predicted behavior were identified by human inspection of the computed titration curves. It is preferable, however, to select the unusual residues by mathematically well‐defined criteria, in order to reduce the chance of error, eliminate possible bias, and substantially speed up the selection process. Here we present simple statistical tests that constitute such selection criteria. The first derivatives of the predicted titration curves resemble distribution functions and are normalized, and the moments of these first derivative functions are then computed. We show that the third and fourth moments, measures of asymmetry and kurtosis, respectively, are good measures of the deviation from normal behavior. Results are presented for 44 different enzymes. Detailed results are given for four enzymes representing four different types of chemistry: arginine kinase from Limulus polyphemus (horseshoe crab); β‐lactamase from Escherichia coli; glutamate racemase from Aquifex pyrophilus; and 3‐isopropylmalate dehydrogenase from Thiobacillus ferrooxidans. The relationship between the statistical measures of nonsigmoidal behavior in the predicted titration curves and the catalytic activity of the residue is discussed. Proteins 2005. © 2005 Wiley‐Liss, Inc.
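
The moment analysis described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The function names (`hh_curve`, `derivative_moments`), the pH grid, and the synthetic two-step curve standing in for a "deviant" residue are all assumptions made for illustration. Reporting standardized moments (skewness and kurtosis) alongside the raw central moments is also an assumption, since the abstract does not say which form is used.

```python
import numpy as np

def hh_curve(ph, pka):
    """Protonated fraction for a single-site Henderson-Hasselbalch titration."""
    return 1.0 / (1.0 + 10.0 ** (ph - pka))

def derivative_moments(ph, frac):
    """Moments of the normalized first derivative of a titration curve.

    ph   : uniformly spaced pH grid (assumed wide enough to cover the tails)
    frac : protonated fraction at each pH, decreasing from 1 to 0
    """
    dx = ph[1] - ph[0]
    # The curve falls with pH, so the negated derivative is non-negative.
    dens = -np.gradient(frac, ph)
    # Normalize so the derivative integrates to 1, like a distribution.
    dens = dens / (dens.sum() * dx)
    mean = (ph * dens).sum() * dx
    dev = ph - mean
    mu2 = (dev ** 2 * dens).sum() * dx  # second central moment (variance)
    mu3 = (dev ** 3 * dens).sum() * dx  # third central moment (asymmetry)
    mu4 = (dev ** 4 * dens).sum() * dx  # fourth central moment
    return {
        "mean": mean,
        "mu2": mu2,
        "mu3": mu3,
        "mu4": mu4,
        "skewness": mu3 / mu2 ** 1.5,  # standardized third moment
        "kurtosis": mu4 / mu2 ** 2,    # standardized fourth moment
    }

# Illustrative grid and curves (hypothetical values, not data from the paper).
ph_grid = np.linspace(-8.0, 22.0, 6001)
normal = hh_curve(ph_grid, 7.0)
# Synthetic two-step curve: a crude stand-in for a perturbed, non-sigmoidal residue.
deviant = 0.7 * hh_curve(ph_grid, 4.0) + 0.3 * hh_curve(ph_grid, 9.0)

for label, curve in (("normal", normal), ("deviant", deviant)):
    m = derivative_moments(ph_grid, curve)
    print(label, {k: round(float(v), 3) for k, v in m.items()})
```

For the single-site curve the derivative is a logistic density in pH, so its skewness is close to zero and its standardized fourth moment is close to 4.2. The synthetic two-step curve gives much larger third and fourth central moments, which is the kind of deviation the statistical tests are meant to flag.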