High Performance Software on Intel Pentium Pro Processors or Micro-Ops to TeraFLOPS
[1] Shmuel Winograd, et al. A New Algorithm for Inner Product, 1968, IEEE Transactions on Computers.
[2] V. Strassen. Gaussian elimination is not optimal, 1969.
[3] Jack J. Dongarra. Performance of various computers using standard linear equations software in a Fortran environment, 1983, CARN.
[4] Ed Anderson, et al. LAPACK Users' Guide, 1995.
[5] Robert A. van de Geijn. Massively Parallel Linpack Benchmark on the Intel Touchstone Delta and iPSC/860 Systems (Progress Report), 1991.
[6] J. Dongarra. Performance of various computers using standard linear equations software, 1990, CARN.
[7] Bruce Hendrickson, et al. The Torus-Wrap Mapping for Dense Matrix Calculations on Massively Parallel Computers, 1994, SIAM J. Sci. Comput.
[8] Greg Henry, et al. Massively Parallel Distributed Computing: World's First 281 Gigaflop Supercomputer, 1995.
[9] James Demmel, et al. ScaLAPACK: A Portable Linear Algebra Library for Distributed Memory Computers - Design Issues and Performance, 1995, Proceedings of the 1996 ACM/IEEE Conference on Supercomputing.
[10] David A. Patterson, et al. Computer organization and design (2nd ed.): the hardware/software interface, 1997.
[11] David J. Lilja, et al. When Caches Aren't Enough: Data Prefetching Techniques, 1997, Computer.
[12] Christoforos E. Kozyrakis, et al. A case for intelligent RAM, 1997, IEEE Micro.