Exploratory analysis of high-throughput metabolomic data

Making sense of the sheer volume of metabolomic data that current technology can generate requires robust data analysis tools. We propose the use of the growing self-organizing map (GSOM) algorithm and demonstrate, on simulated and real-world time-course metabolomic datasets, that it permits a deeper analysis of metabolomics data than the widely used batch-learning self-organizing map, hierarchical cluster analysis and partitioning around medoids algorithms. We then apply GSOM to a recently published dataset representing the metabolome response patterns of three wheat cultivars subjected to a field-simulated cyclic drought stress. The novel and information-rich analysis provided by the proposed GSOM framework can be easily extended to other high-throughput metabolomics studies.
