Graph Evolution via Social Diffusion Processes

We present a new stochastic process, the Social Diffusion Process (SDP), for graph modeling. Based on this model, we derive a graph evolution algorithm and a series of graph-based approaches to machine learning problems, including clustering and semi-supervised learning. SDP can be viewed as a special case of the Matthew effect, a general phenomenon in nature and society. We use social events as a metaphor for the intrinsic stochastic process underlying a broad range of data. We evaluate our approaches on a large number of frequently used datasets and compare them with other state-of-the-art techniques. Results show that our algorithm outperforms existing methods in most cases. We also apply our algorithm to the functional analysis of microRNA and discover biologically interesting cliques. Given the broad availability of graph-based data, our new model and algorithm have potential applications in a wide range of domains.
