Insights into Protein Sequence and Structure-Derived Features Mediating 3D Domain Swapping Mechanism using Support Vector Machine Based Approach

Three-dimensional (3D) domain swapping is a mechanism by which two or more protein molecules form higher-order oligomers by exchanging identical or similar subunits. The phenomenon has recently received much attention in the context of prions and neurodegenerative diseases because of its role in functional regulation, formation of higher-order oligomers, protein misfolding and aggregation. Although 3D domain swapping can be detected from three-dimensional structures, deriving common sequence or structural patterns from the proteins involved remains a formidable challenge. We have developed an SVM-based classifier to predict domain swapping events using a set of features derived from sequence and structural data. The classifier was trained on features derived from 150 proteins reported to be involved in 3D domain swapping and 150 proteins that are neither known to adopt a swapped conformation nor related to proteins involved in swapping. Testing was performed using 63 proteins from the positive dataset and 63 proteins from the negative dataset. We obtained 76.33% accuracy in training and 73.81% accuracy in testing. Given the high diversity in the sequence, structure and function of proteins involved in domain swapping, an algorithm that predicts swapping events from sequence- and structure-derived features is an initial step towards identifying more putative proteins that may be involved in swapping or in deposition diseases. Further, the top features emerging from our feature selection method may be analysed to understand their roles in the mechanism of domain swapping.
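As a rough consistency check on the figures above, the following sketch back-calculates how many correct predictions the reported accuracies would imply for balanced sets of 300 training and 126 test proteins. It assumes that accuracy means correct predictions divided by total proteins, which the abstract does not state explicitly. The function names are illustrative and not taken from the paper.

```python
def implied_correct(accuracy_pct: float, n_total: int) -> int:
    """Nearest whole number of correct predictions for a reported accuracy (assumed definition)."""
    return round(accuracy_pct / 100 * n_total)


def accuracy_pct(n_correct: int, n_total: int) -> float:
    """Accuracy as a percentage, rounded to two decimals."""
    return round(100 * n_correct / n_total, 2)


# Dataset sizes stated in the abstract: positives + negatives.
train_total = 150 + 150
test_total = 63 + 63

# Counts implied by the reported percentages (an inference, not a reported result).
train_correct = implied_correct(76.33, train_total)
test_correct = implied_correct(73.81, test_total)

print(train_correct, "of", train_total, "->", accuracy_pct(train_correct, train_total), "%")
print(test_correct, "of", test_total, "->", accuracy_pct(test_correct, test_total), "%")
```

Under that assumption, the reported percentages are reproduced by 229 of 300 training proteins and 93 of 126 test proteins.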
