Tissue-based map of the human proteome

Protein expression across human tissues

Sequencing the human genome gave new insights into human biology and disease. However, the ultimate goal is to understand the dynamic expression of each of the approximately 20,000 protein-coding genes and the function of each protein. Uhlén et al. now present a map of protein expression across 32 human tissues. They not only measured expression at an RNA level, but also used antibody profiling to precisely localize the corresponding proteins. An interactive website allows exploration of expression patterns across the human body.

Science, this issue 10.1126/science.1260419

Transcriptomics and immunohistochemistry map protein expression across 32 human tissues.

INTRODUCTION Resolving the molecular details of proteome variation in the different tissues and organs of the human body would greatly increase our knowledge of human biology and disease. Here, we present a map of the human tissue proteome based on quantitative transcriptomics on a tissue and organ level combined with protein profiling using microarray-based immunohistochemistry to achieve spatial localization of proteins down to the single-cell level. We provide a global analysis of the secreted and membrane proteins, as well as an analysis of the expression profiles for all proteins targeted by pharmaceutical drugs and proteins implicated in cancer.

RATIONALE We have used an integrative omics approach to study the spatial human proteome. Samples representing all major tissues and organs (n = 44) in the human body have been analyzed based on 24,028 antibodies corresponding to 16,975 protein-encoding genes, complemented with RNA-sequencing data for 32 of the tissues. The antibodies have been used to produce more than 13 million tissue-based immunohistochemistry images, each annotated by pathologists for all sampled tissues. To facilitate integration with other biological resources, all data are available for download and cross-referencing.

RESULTS We report a genome-wide analysis of the tissue specificity of RNA and protein expression covering more than 90% of the putative protein-coding genes, complemented with analyses of various subproteomes, such as predicted secreted proteins (n = 3171) and membrane-bound proteins (n = 5570). The analysis shows that almost half of the genes are expressed in all analyzed tissues, which suggests that the gene products are needed in all cells to maintain “housekeeping” functions such as cell growth, energy generation, and basic metabolism. Furthermore, there is enrichment in metabolism among these genes, as 60% of all metabolic enzymes are expressed in all analyzed tissues. The largest number of tissue-enriched genes is found in the testis, followed by the brain and the liver. Analysis of the 618 proteins targeted by clinically approved drugs unexpectedly showed that 30% are expressed in all analyzed tissues. An analysis of metabolic activity based on genome-scale metabolic models (GEMS) revealed liver as the most metabolically active tissue, followed by adipose tissue and skeletal muscle.

CONCLUSIONS A freely available interactive resource is presented as part of the Human Protein Atlas portal (www.proteinatlas.org), offering the possibility to explore the tissue-elevated proteomes in tissues and organs and to analyze tissue profiles for specific protein classes. Comprehensive lists of proteins expressed at elevated levels in the different tissues have been compiled to provide a spatial context with localization of the proteins in the subcompartments of each tissue and organ down to the single-cell level.

The human tissue–enriched proteins. All tissue-enriched proteins are shown for 13 representative tissues or groups of tissues, stratified according to their predicted subcellular localization. Enriched proteins are mainly intracellular in testis, mainly membrane bound in brain and kidney, and mainly secreted in pancreas and liver.

Resolving the molecular details of proteome variation in the different tissues and organs of the human body will greatly increase our knowledge of human biology and disease. Here, we present a map of the human tissue proteome based on an integrated omics approach that involves quantitative transcriptomics at the tissue and organ level, combined with tissue microarray–based immunohistochemistry, to achieve spatial localization of proteins down to the single-cell level. Our tissue-based analysis detected more than 90% of the putative protein-coding genes. We used this approach to explore the human secretome, the membrane proteome, the druggable proteome, the cancer proteome, and the metabolic functions in 32 different tissues and organs. All the data are integrated in an interactive Web-based database that allows exploration of individual proteins, as well as navigation of global expression patterns, in all major tissues and organs in the human body.
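
To make the distinction between genes "expressed in all analyzed tissues" and "tissue-enriched" genes concrete, the following is a minimal sketch of how such categories could be assigned from a gene-by-tissue expression table. The detection cutoff, the fold-change threshold, the category names, the precedence of "tissue enriched" over "expressed in all", and the toy values are all assumptions chosen for illustration; the text above does not state the criteria actually used.

```python
from collections import Counter
from typing import Dict

# Hypothetical cutoffs, chosen only for illustration (not taken from the text).
DETECTION_CUTOFF = 1.0  # minimum expression value to call a gene "expressed"
ENRICHMENT_FOLD = 5.0   # top tissue must exceed the runner-up by this factor


def classify_gene(expression: Dict[str, float]) -> str:
    """Assign one gene to a category from its per-tissue expression values."""
    values = sorted(expression.values(), reverse=True)
    if not values or values[0] < DETECTION_CUTOFF:
        return "not detected"
    # Assumed precedence: enrichment is checked before ubiquitous expression.
    if len(values) > 1 and values[0] >= ENRICHMENT_FOLD * values[1]:
        return "tissue enriched"
    if all(v >= DETECTION_CUTOFF for v in values):
        return "expressed in all"
    return "mixed"


def summarize(table: Dict[str, Dict[str, float]]) -> Dict[str, float]:
    """Return the fraction of genes that falls into each category."""
    counts = Counter(classify_gene(profile) for profile in table.values())
    total = sum(counts.values())
    return {category: n / total for category, n in counts.items()}


if __name__ == "__main__":
    # Toy data with made-up gene names and values.
    toy = {
        "geneA": {"liver": 50.0, "brain": 2.0, "testis": 1.5},
        "geneB": {"liver": 10.0, "brain": 8.0, "testis": 12.0},
        "geneC": {"liver": 0.1, "brain": 0.2, "testis": 0.0},
        "geneD": {"liver": 3.0, "brain": 0.2, "testis": 2.0},
    }
    for gene, profile in toy.items():
        print(gene, classify_gene(profile))
    print(summarize(toy))
```

With real data, the same two functions would be applied to one expression profile per gene across the 32 tissues, and the resulting fractions could then be compared with the proportions reported above.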
