Languages and Compilers for Parallel Computing
[1] Vivek Sarkar. Loop Transformations for Hierarchical Parallelism and Locality, 1998, LCR.
[2] Chau-Wen Tseng, et al. Improving data locality with loop transformations, 1996, TOPLAS.
[3] Geoffrey C. Fox, et al. Interpreting the performance of HPF/Fortran 90D, 1994, Proceedings of Supercomputing '94.
[4] Constantine D. Polychronopoulos, et al. Guided Self-Scheduling: A Practical Scheduling Scheme for Parallel Supercomputers, 1987, IEEE Transactions on Computers.
[5] John Randal Allen, et al. Dependence analysis for subscripted variables and its application to program transformations, 1983.
[6] Vivek Sarkar, et al. Optimization of array accesses by collective loop transformations, 1991, ICS '91.
[7] Chau-Wen Tseng. An optimizing Fortran D compiler for MIMD distributed-memory machines, 1993.
[8] Vivek Sarkar, et al. Automatic selection of high-order transformations in the IBM XL FORTRAN compilers, 1997, IBM J. Res. Dev.
[9] John A. Chandy, et al. Communication Optimizations Used in the Paradigm Compiler for Distributed-Memory Multicomputers, 1994, 1994 International Conference on Parallel Processing Vol. 2.
[10] Vivek Sarkar, et al. Automatic parallelization for symmetric shared-memory multiprocessors, 1996, CASCON.
[11] PeiZong Lee, et al. Compiling Efficient Programs for Tightly-Coupled Distributed Memory Computers, 1993, 1993 International Conference on Parallel Processing - ICPP'93.
[12] Vivek Sarkar, et al. Optimal weighted loop fusion for parallel programs, 1997, SPAA '97.
[13] Monica S. Lam, et al. A Loop Transformation Theory and an Algorithm to Maximize Parallelism, 1991, IEEE Trans. Parallel Distributed Syst.
[14] Prithviraj Banerjee, et al. Automatic Selection of Dynamic Data Partitioning Schemes for Distributed-Memory Multicomputers, 1995, LCPC.
[15] William Pugh, et al. Minimizing communication while preserving parallelism, 1996, ICS '96.
[16] Santosh G. Abraham, et al. Compiler techniques for data partitioning of sequentially iterated parallel loops, 1990, ICS '90.
[17] Geoffrey C. Fox, et al. Java as a Language for Scientific Parallel Programming, 1997, LCPC.
[18] Jingke Li, et al. Index domain alignment: minimizing cost of cross-referencing between distributed arrays, 1990, [1990 Proceedings] The Third Symposium on the Frontiers of Massively Parallel Computation.
[19] Vivek Sarkar, et al. A general framework for iteration-reordering loop transformations, 1992, PLDI '92.
[20] Michael Metcalf, et al. Fortran 90 Explained, 1990.
[21] J. Ramanujam, et al. A methodology for parallelizing programs for multicomputers and complex memory multiprocessors, 1989, Proceedings of the 1989 ACM/IEEE Conference on Supercomputing (Supercomputing '89).
[22] Ken Kennedy, et al. Automatic data layout for distributed-memory machines, 1998, TOPLAS.
[23] Prithviraj Banerjee, et al. Compiler techniques for optimizing communication and data distribution for distributed-memory multicomputers, 1996.
[24] Vivek Sarkar, et al. On Estimating and Enhancing Cache Effectiveness, 1991, LCPC.
[25] Guang R. Gao, et al. Automatic Data and Computation Decomposition for Distributed-Memory Machines, 1995, Parallel Process. Lett.
[26] Alan Jay Smith, et al. Performance Characterization of Optimizing Compilers, 1992, IEEE Trans. Software Eng.
[27] Ko-Yang Wang. Precise compile-time performance prediction for superscalar-based computers, 1994, PLDI '94.
[28] Rafael Hector Saavedra-Barrera, et al. CPU performance evaluation and execution time prediction using narrow spectrum benchmarking, 1992.
[29] John Paul Shen, et al. Theoretical modeling of superscalar processor performance, 1994, Proceedings of MICRO-27. The 27th Annual IEEE/ACM International Symposium on Microarchitecture.