High Performance Computing - HiPC 2003

Is there a unifying principle in biology that can be deciphered from the structures of all present-day genomes? Can one hope to determine the general structure and organization of cellular information, and to elucidate the underlying evolutionary dynamics, by systematically exploring various statistical characteristics of genomic and proteomic data at many levels? Valis, a large-scale software and hardware system developed by the NYU bioinformatics group, aims to do just that. We analyze word-frequency and domain family size distributions across various genomes, and the distribution of potential "hot-spots" for segmental duplications in the human genome. We hypothesize, and test by computational analysis, that such a pattern is the result of a generic, dominating evolutionary mechanism, "evolution by duplication", originally suggested by Susumu Ohno. We examine what implications these duplications may have for the translocation of male-biased genes from sex chromosomes, the genome structure of sex chromosomes, copy-number fluctuations in cancer genomes (amplifications of oncogenes and hemizygous or homozygous deletions of tumor suppressor genes), and related phenomena. Through our explorations with Valis, we examine how important a role information technology is likely to play in elucidating biology at all levels.

T.M. Pinkston and V.K. Prasanna (Eds.): HiPC 2003, LNCS 2913, p. 1, 2003. © Springer-Verlag Berlin Heidelberg 2003

T.M. Pinkston and V.K. Prasanna (Eds.): HiPC 2003, LNCS 2913, pp. 2–11, 2003. © Springer-Verlag Berlin Heidelberg 2003

Performance Analysis of Blue Gene/L Using Parallel Discrete Event Simulation

Ed Upchurch, Paul L. Springer, Maciej Brodowicz, Sharon Brunett, and T.D. Gottschalk
Center for Advanced Computing Research, California Institute of Technology, 1200 E. California Boulevard, Pasadena, CA 91125
etu@cacr.caltech.edu

Abstract.
High performance computers currently under construction, such as IBM's Blue Gene/L, consist of large numbers (64K) of low-cost processing elements with relatively small local memories (256 MB), connected via relatively low-bandwidth (0.375 bytes/FLOP), low-cost interconnection networks; such machines promise exceptional cost-performance for some scientific applications. Due to the large number of processing elements and the adaptive routing networks in such systems, performance analysis of meaningful application kernels requires innovative methods. This paper describes a method that combines application analysis, tracing, and parallel discrete event simulation to provide early performance prediction. Specifically, we present results of a performance analysis of a Lennard-Jones Spatial (LJS) Decomposition molecular dynamics benchmark code for Blue Gene/L.
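To illustrate the event-queue pattern at the heart of discrete event simulation of an interconnection network, the following minimal sequential sketch models point-to-point messages hopping through a small 3D torus. It is not the paper's simulator: the torus size, dimension-ordered routing, and per-hop latency below are invented assumptions chosen only to make the mechanism concrete.

```python
import heapq

# Illustrative parameters (assumptions, not Blue Gene/L's actual values).
TORUS = (4, 4, 4)      # a small 4x4x4 torus for the sketch
HOP_LATENCY = 10.0     # cost per hop, in arbitrary time units

def hops(src, dst):
    """Minimal hop count between two torus nodes, using wrap-around links."""
    n = 0
    for s, d, size in zip(src, dst, TORUS):
        delta = abs(s - d)
        n += min(delta, size - delta)
    return n

def next_hop(pos, dst):
    """One step of simple dimension-ordered routing, choosing the shorter
    direction around each torus dimension."""
    pos = list(pos)
    for i, size in enumerate(TORUS):
        if pos[i] != dst[i]:
            fwd = (dst[i] - pos[i]) % size
            step = 1 if fwd <= size - fwd else -1
            pos[i] = (pos[i] + step) % size
            break
    return tuple(pos)

def simulate(messages):
    """Process hop events in timestamp order from a priority queue.
    `messages` is a list of (send_time, src, dst); returns arrival times."""
    events = []  # min-heap of (time, msg_id, position, destination)
    for msg_id, (t, src, dst) in enumerate(messages):
        heapq.heappush(events, (t, msg_id, src, dst))
    arrivals = {}
    while events:
        t, msg_id, pos, dst = heapq.heappop(events)
        if pos == dst:
            arrivals[msg_id] = t          # message delivered
        else:                             # schedule the next hop
            heapq.heappush(events, (t + HOP_LATENCY, msg_id,
                                    next_hop(pos, dst), dst))
    return arrivals

# Two messages injected at t = 0: one needs 3 hops, one reaches its
# neighbor in 1 hop via the wrap-around link.
arr = simulate([(0.0, (0, 0, 0), (2, 1, 0)),
                (0.0, (0, 0, 0), (3, 0, 0))])
```

A parallel discrete event simulator distributes such an event queue across many simulation processes and synchronizes their local clocks; this sketch keeps everything sequential to show only the core event-driven idea.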