Microarray Analysis of Autoimmune Diseases by Machine Learning Procedures

Microarray-based global gene expression profiling, combined with sophisticated statistical algorithms, is providing new insights into the pathogenesis of autoimmune diseases. We have applied a novel statistical technique for gene selection, based on machine learning approaches, to analyze microarray expression data gathered from patients with systemic lupus erythematosus (SLE) and primary antiphospholipid syndrome (PAPS), two autoimmune diseases of unknown genetic origin that share many features. The methodology combined three data discretization policies, a consensus gene selection method, and a multivariate correlation measurement. A set of 150 genes was found to discriminate SLE and PAPS patients from healthy individuals. Statistical validations demonstrate the relevance of this gene set from both a univariate and a multivariate perspective. Moreover, functional characterization of these genes identified an interferon-regulated gene signature, consistent with previous reports. It also revealed other regulatory pathways that are altered in SLE and PAPS, including those regulated by PTEN, TNF, and BCL-2. Remarkably, a significant number of these genes carry E2F binding motifs in their promoters, suggesting a role for E2F in the regulation of autoimmunity.
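The general idea of consensus gene selection across several discretization policies can be illustrated with a minimal Python sketch. It is not the authors' implementation: the three policies shown here (equal-width, equal-frequency, and median split), the mutual-information score, the `top_k` cutoff, and the toy gene names and values are all assumptions chosen for illustration, and the multivariate correlation step is omitted.

```python
import math
from collections import Counter

# Three illustrative discretization policies (assumed, not taken from the paper).
def equal_width(values, bins=3):
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0] * len(values)
    width = (hi - lo) / bins
    return [min(int((v - lo) / width), bins - 1) for v in values]

def equal_frequency(values, bins=3):
    order = sorted(range(len(values)), key=lambda i: values[i])
    codes = [0] * len(values)
    for rank, i in enumerate(order):
        codes[i] = rank * bins // len(values)
    return codes

def median_split(values):
    ordered = sorted(values)
    median = ordered[len(ordered) // 2]  # upper median for even-length input
    return [int(v >= median) for v in values]

def mutual_information(codes, labels):
    """Mutual information (in bits) between a discretized gene and the class."""
    n = len(codes)
    joint = Counter(zip(codes, labels))
    px, py = Counter(codes), Counter(labels)
    return sum((c / n) * math.log2(c * n / (px[x] * py[y]))
               for (x, y), c in joint.items())

def consensus_genes(expression, labels, top_k=2):
    """Keep only the genes ranked in the top_k under every policy."""
    selected = []
    for policy in (equal_width, equal_frequency, median_split):
        scores = {g: mutual_information(policy(v), labels)
                  for g, v in expression.items()}
        # Sort by descending score, breaking ties by gene name.
        ranked = sorted(scores, key=lambda g: (-scores[g], g))
        selected.append(set(ranked[:top_k]))
    return set.intersection(*selected)

# Hypothetical toy data: four genes measured in three controls and three patients.
labels = ["ctrl", "ctrl", "ctrl", "sle", "sle", "sle"]
expression = {
    "GENE_A": [1.0, 1.2, 0.9, 3.1, 3.0, 3.3],
    "GENE_B": [2.0, 2.1, 1.9, 2.0, 2.2, 1.8],
    "GENE_C": [0.5, 0.4, 0.6, 2.5, 2.7, 2.4],
    "GENE_D": [5.0, 1.0, 3.0, 5.1, 0.9, 3.1],
}

print(sorted(consensus_genes(expression, labels)))
```

Requiring agreement across policies is meant to reduce the dependence of the selected gene set on any single choice of discretization thresholds, which matters when the number of samples is small relative to the number of genes.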
