Optimizing matrix multiply using PHiPAC: a portable, high-performance, ANSI C coding methodology
[1] James Demmel, et al. The PHiPAC v1.0 Matrix-Multiply Distribution, 1998.
[2] James Demmel, et al. Using PHiPAC to speed error back-propagation learning, 1997, 1997 IEEE International Conference on Acoustics, Speech, and Signal Processing.
[3] Jacqueline Chame, et al. The combined effectiveness of unimodular transformations, tiling, and software prefetching, 1996, Proceedings of International Conference on Parallel Processing.
[4] Brian Kingsbury, et al. Spert-II: A Vector Microprocessor System, 1996, Computer.
[5] James Demmel, et al. ScaLAPACK: A Portable Linear Algebra Library for Distributed Memory Computers - Design Issues and Performance, 1995, Proceedings of the 1996 ACM/IEEE Conference on Supercomputing.
[6] Bowen Alpern, et al. Space-limited procedures: a methodology for portable high-performance, 1995, Programming Models for Massively Parallel Computers.
[7] Michael Wolfe, et al. High performance compilers for parallel computing, 1995.
[8] Larry Carter, et al. Hierarchical tiling for improved superscalar performance, 1995, Proceedings of 9th International Parallel Processing Symposium.
[9] John McCalpin, et al. Automatic benchmark generation for cache optimization of matrix operations, 1995, ACM-SE 33.
[10] Peter Norvig, et al. Artificial Intelligence: A Modern Approach, 1995.
[11] Chandrika Kamath, et al. DXML: A High-performance Scientific Subroutine Library, 1994, Digit. Tech. J.
[12] Bo Kågström, et al. Portable High Performance GEMM-Based Level 3 BLAS, 1993, PPSC.
[13] Ed Anderson, et al. LAPACK users' guide - [release 1.0], 1992.
[14] Monica S. Lam, et al. A data locality optimizing algorithm, 1991, PLDI '91.
[15] Monica S. Lam, et al. The cache performance and optimizations of blocked algorithms, 1991, ASPLOS IV.
[16] Jack J. Dongarra, et al. A set of level 3 basic linear algebra subprograms, 1990, TOMS.
[17] Jack J. Dongarra, et al. An extended set of FORTRAN basic linear algebra subprograms, 1988, TOMS.
[18] G. Golub. Matrix computations, 1983.
[19] Charles L. Lawson, et al. Basic Linear Algebra Subprograms for Fortran Usage, 1979, TOMS.