Found In Translation: a machine learning model for mouse-to-human inference

Cross-species differences form barriers to translational research that ultimately hinder the success of clinical trials, yet knowledge of these differences has not been systematically incorporated into the interpretation of animal models. Here we present Found In Translation (FIT; http://www.mouse2man.org), a statistical methodology that leverages public gene expression data to extrapolate the results of a new mouse experiment to expression changes in the equivalent human condition. We applied FIT to data from mouse models of 28 different human diseases and identified experimental conditions in which FIT predictions outperformed direct cross-species extrapolation from mouse results, increasing the overlap of differentially expressed genes by 20–50%. FIT predicted novel disease-associated genes, an example of which we validated experimentally. FIT highlights signals that may otherwise be missed and reduces false leads, with no experimental cost.

The machine learning approach FIT leverages public mouse and human expression data to improve the translation of mouse model results to analogous human disease.
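
To make the reported gain in overlap concrete, the following is a minimal sketch of how such a comparison could be computed. It is an illustration only: the abstract does not specify the exact overlap metric, so the definition used here (the fraction of human differentially expressed genes recovered by a predicted gene set) is an assumption, and the function names and gene identifiers are hypothetical.

```python
def de_overlap(predicted, observed):
    """Fraction of observed human DE genes that also appear in the predicted set.

    Assumed definition for illustration; the abstract does not state the metric.
    """
    predicted, observed = set(predicted), set(observed)
    if not observed:
        return 0.0
    return len(predicted & observed) / len(observed)


def relative_gain(baseline, improved):
    """Relative improvement of `improved` over `baseline` (0.25 means +25%)."""
    if baseline == 0:
        raise ValueError("baseline overlap is zero; relative gain is undefined")
    return (improved - baseline) / baseline


# Hypothetical gene sets, not data from the study.
human_de = {"G1", "G2", "G3", "G4", "G5", "G6", "G7", "G8", "G9", "G10"}
mouse_direct = {"G1", "G2", "G3", "G4", "G11", "G12"}    # direct extrapolation
model_adjusted = {"G1", "G2", "G3", "G4", "G5", "G13"}   # model-adjusted prediction

direct = de_overlap(mouse_direct, human_de)
adjusted = de_overlap(model_adjusted, human_de)
gain = relative_gain(direct, adjusted)
```

With these toy sets, the direct extrapolation recovers four of the ten human genes and the adjusted prediction recovers five, so `gain` is a relative improvement of about 25%, which happens to fall inside the 20–50% range quoted above.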
