A robust and efficient algorithm for the shape description of protein structures and its application in predicting ligand binding sites

Background: An accurate description of protein shape derived from protein structure is necessary to establish an understanding of protein-ligand interactions, which in turn will lead to improved methods for protein-ligand docking and binding site analysis. Most current shape descriptors characterize only the local properties of protein structure using an all-atom representation and are slow to compute. We need new shape descriptors that capture both local and global structural information, are robust when applied to models and low-quality structures, and are computationally efficient enough to permit high-throughput analysis of protein structures.

Results: We introduce a new shape description that requires only the Cα atoms to represent the protein structure, making it both fast and suitable for use on models and low-quality structures. The notion of a geometric potential is introduced to quantitatively describe the shape of the structure. This geometric potential depends on both the global shape of the protein structure and the surrounding environment of each residue. When the geometric potential is applied to binding site prediction, approximately 85% of known binding sites can be accurately identified with above 50% residue coverage and 80% specificity. Moreover, the algorithm is fast enough for proteome-scale applications: proteins with fewer than 500 amino acids can be scanned in less than two seconds.

Conclusion: The reduced representation of the protein structure combined with the geometric potential provides a fast, quantitative description of protein-ligand binding sites with potential for use in large-scale predictions, comparisons and analysis.
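To make the reduced representation and the reported evaluation measures concrete, the following is a minimal Python sketch, not the authors' implementation. It shows one way to keep only the Cα atoms from PDB-format text and to score a predicted site against a known one. The function names `parse_ca_atoms` and `coverage_specificity` are hypothetical, and the definitions used here (coverage as the fraction of known site residues that are predicted, specificity as the fraction of non-site residues that are not predicted) are assumed conventional ones that may differ from those used in the paper. The geometric potential itself is not reproduced, since the abstract does not define it.

```python
from typing import Dict, Hashable, Set, Tuple

ResidueKey = Tuple[str, int]  # (chain ID, residue sequence number)
Coord = Tuple[float, float, float]


def parse_ca_atoms(pdb_text: str) -> Dict[ResidueKey, Coord]:
    """Return one C-alpha coordinate per residue from PDB-format text.

    Assumptions of this sketch: fixed-column PDB ATOM records, only the
    first (blank or 'A') alternate location is kept, and insertion codes
    are ignored.
    """
    coords: Dict[ResidueKey, Coord] = {}
    for line in pdb_text.splitlines():
        # Only protein ATOM records; HETATM (e.g. calcium ions) is skipped.
        if not line.startswith("ATOM") or len(line) < 54:
            continue
        if line[12:16].strip() != "CA":      # atom name, columns 13-16
            continue
        if line[16] not in (" ", "A"):       # alternate location, column 17
            continue
        key = (line[21], int(line[22:26]))   # chain ID and residue number
        xyz = (float(line[30:38]), float(line[38:46]), float(line[46:54]))
        coords.setdefault(key, xyz)          # keep the first occurrence only
    return coords


def coverage_specificity(
    predicted: Set[Hashable],
    known: Set[Hashable],
    all_residues: Set[Hashable],
) -> Tuple[float, float]:
    """Score a predicted binding site against a known one.

    coverage    = predicted known-site residues / known-site residues
    specificity = non-site residues left unpredicted / non-site residues
    (assumed definitions; see the lead-in above)
    """
    true_pos = len(predicted & known)
    negatives = all_residues - known
    true_neg = len(negatives - predicted)
    coverage = true_pos / len(known) if known else 0.0
    specificity = true_neg / len(negatives) if negatives else 0.0
    return coverage, specificity
```

Because only one atom per residue is read, the parsing step is linear in file size and tolerates structures with missing side chains, which is in keeping with the abstract's point about models and low-quality structures.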
