Capsule Network for Predicting RNA-Protein Binding Preferences Using Hybrid Feature

RNA-protein binding is involved in many biological processes. As experimental technology advances, more and more binding data are becoming available, and many methods have been proposed to predict RNA-protein binding preferences from these data. Some of these methods use only RNA sequence features, while others combine multiple features, but their performance is still not satisfactory. In this study, we propose an improved capsule network, iCapsule, that predicts RNA-protein binding preferences using both RNA sequence features and structure features. Experimental results show that iCapsule performs better than three baseline methods in this field. Because the model uses both sequence and structure features, we also tested how changes to the primary capsule layer affect model performance. In addition, we studied the impact of model structure on performance by running the proposed method with different numbers of convolution layers and different kernel sizes.
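To make the idea of a hybrid input concrete, the following is a minimal sketch, not the iCapsule implementation. It assumes the sequence is one-hot encoded over A/C/G/U and the structure is given in dot-bracket notation; the abstract does not specify either encoding. The function names `one_hot`, `hybrid_features`, and `squash` are hypothetical. The `squash` nonlinearity is the standard one from the capsule network literature, which keeps each capsule vector's length below 1.

```python
import numpy as np

SEQ_ALPHABET = "ACGU"      # RNA nucleotides
STRUCT_ALPHABET = "(.)"    # assumption: dot-bracket structure symbols


def one_hot(symbols, alphabet):
    """Encode a string as a (length, alphabet size) one-hot matrix.

    Symbols outside the alphabet are left as all-zero rows.
    """
    index = {ch: i for i, ch in enumerate(alphabet)}
    out = np.zeros((len(symbols), len(alphabet)), dtype=np.float32)
    for pos, ch in enumerate(symbols):
        if ch in index:
            out[pos, index[ch]] = 1.0
    return out


def hybrid_features(seq, struct):
    """Concatenate sequence and structure encodings position by position."""
    if len(seq) != len(struct):
        raise ValueError("sequence and structure must have the same length")
    return np.concatenate(
        [one_hot(seq, SEQ_ALPHABET), one_hot(struct, STRUCT_ALPHABET)], axis=1
    )


def squash(s, axis=-1, eps=1e-8):
    """Capsule squashing: preserve direction, map vector length into [0, 1)."""
    norm_sq = np.sum(s ** 2, axis=axis, keepdims=True)
    return (norm_sq / (1.0 + norm_sq)) * s / np.sqrt(norm_sq + eps)


features = hybrid_features("ACGU", "(..)")   # shape (4, 7)
capsule = squash(np.array([3.0, 4.0]))       # length close to 25/26
```

In a full model, a matrix like `features` would feed the convolution layers, and `squash` would be applied to the outputs of the primary capsule layer; how iCapsule wires these together is not described in the abstract.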
