Diagnosis of multiple cancer types by shrunken centroids of gene expression

We have devised an approach to cancer class prediction from gene expression profiling, based on an enhancement of the simple nearest prototype (centroid) classifier. We shrink the prototypes and hence obtain a classifier that is often more accurate than competing methods. Our method of “nearest shrunken centroids” identifies subsets of genes that best characterize each class. The technique is general and can be used in many other classification problems. To demonstrate its effectiveness, we show that the method was highly efficient in finding genes for classifying small round blue cell tumors and leukemias.
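The shrinkage idea can be illustrated with a minimal sketch. The abstract does not give the formulas, so the code below makes simplifying assumptions: each class centroid is moved toward the overall centroid by soft thresholding with a hypothetical threshold `delta`, and the per-gene standardization that a full implementation would likely use is omitted. The function names and the toy data are illustrative, not the authors' implementation.

```python
from collections import defaultdict


def soft_threshold(value, delta):
    """Move value toward zero by delta; anything within delta of zero becomes 0."""
    if value > delta:
        return value - delta
    if value < -delta:
        return value + delta
    return 0.0


def fit_shrunken_centroids(samples, labels, delta):
    """Return (overall centroid, shrunken per-class offsets from it).

    Simplification (assumption): offsets are thresholded on the raw scale,
    without dividing by a within-class standard deviation.
    """
    n_genes = len(samples[0])
    overall = [sum(row[j] for row in samples) / len(samples) for j in range(n_genes)]
    by_class = defaultdict(list)
    for row, label in zip(samples, labels):
        by_class[label].append(row)
    offsets = {}
    for label, rows in by_class.items():
        centroid = [sum(r[j] for r in rows) / len(rows) for j in range(n_genes)]
        offsets[label] = [
            soft_threshold(centroid[j] - overall[j], delta) for j in range(n_genes)
        ]
    return overall, offsets


def predict(overall, offsets, sample):
    """Assign the sample to the class whose shrunken centroid is nearest."""
    def squared_distance(label):
        return sum(
            (x - (m + d)) ** 2 for x, m, d in zip(sample, overall, offsets[label])
        )
    return min(offsets, key=squared_distance)


def selected_genes(offsets):
    """Indices of genes whose offset survives shrinkage in at least one class."""
    n_genes = len(next(iter(offsets.values())))
    return [j for j in range(n_genes) if any(o[j] != 0.0 for o in offsets.values())]


# Toy data (made up for illustration): 4 samples, 3 genes, 2 classes.
toy_samples = [
    [5.0, 1.0, 2.0],
    [6.0, 1.2, 2.1],
    [1.0, 1.1, 2.0],
    [0.0, 0.9, 1.9],
]
toy_labels = ["A", "A", "B", "B"]

overall, offsets = fit_shrunken_centroids(toy_samples, toy_labels, delta=0.5)
print(selected_genes(offsets))
print(predict(overall, offsets, [4.8, 1.0, 2.0]))
```

In this toy case only the first gene differs between classes by more than the threshold, so it is the only one that still contributes to classification after shrinkage. That is the sense in which shrinking the centroids also selects a subset of genes.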
