A Hybrid Approach for Mapping Conjugate Gradient onto an FPGA-Augmented Reconfigurable Supercomputer

Supercomputer companies such as Cray, Silicon Graphics, and SRC Computers now offer reconfigurable computer (RC) systems that combine general-purpose processors (GPPs) with field-programmable gate arrays (FPGAs). The FPGAs can be programmed to become, in effect, application-specific processors. These supercomputers allow end users to create custom computing architectures aimed at the computationally intensive parts of each problem. This report describes a parameterized, parallelized, deeply pipelined, dual-FPGA, IEEE 754 64-bit floating-point design for accelerating the conjugate gradient (CG) iterative method on an FPGA-augmented RC. The FPGA-based elements are developed via a hybrid approach that uses a high-level language (HLL)-to-hardware description language (HDL) compiler in conjunction with custom-built, VHDL-based, floating-point components. A reference version of the design is implemented on a contemporary RC. Measured run-time performance data compare the FPGA-augmented CG to the software-only version and show that the FPGA-based version runs 1.3 times faster than the software version. Estimates show that the design can achieve a fourfold speedup on a next-generation RC.
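For reference, the kernel being accelerated is the classic Hestenes-Stiefel CG iteration for symmetric positive-definite systems Ax = b. The sketch below is a minimal, dense, software-only C version intended only to illustrate the operations the dual-FPGA design pipelines (one matrix-vector product, two dot products, and a few vector updates per iteration); it is not the paper's implementation, and the dense storage, function names, tolerance, and iteration limit are illustrative assumptions.

```c
#include <math.h>
#include <stdio.h>
#include <stdlib.h>

/* Dense matrix-vector product y = A * x (row-major A, n x n). */
static void matvec(int n, const double *A, const double *x, double *y)
{
    for (int i = 0; i < n; i++) {
        double s = 0.0;
        for (int j = 0; j < n; j++)
            s += A[i * n + j] * x[j];
        y[i] = s;
    }
}

/* Dot product of two length-n vectors. */
static double dot(int n, const double *u, const double *v)
{
    double s = 0.0;
    for (int i = 0; i < n; i++)
        s += u[i] * v[i];
    return s;
}

/* Unpreconditioned CG: solve A x = b for symmetric positive-definite A.
 * x holds the initial guess on entry and the approximate solution on
 * return; the return value is the number of iterations performed. */
static int cg(int n, const double *A, const double *b, double *x,
              double tol, int max_iter)
{
    double *r  = malloc(n * sizeof *r);   /* residual b - A x  */
    double *p  = malloc(n * sizeof *p);   /* search direction  */
    double *Ap = malloc(n * sizeof *Ap);  /* A * p             */

    matvec(n, A, x, Ap);
    for (int i = 0; i < n; i++) {
        r[i] = b[i] - Ap[i];
        p[i] = r[i];
    }
    double rr = dot(n, r, r);

    int k;
    for (k = 0; k < max_iter && sqrt(rr) > tol; k++) {
        matvec(n, A, p, Ap);
        double alpha = rr / dot(n, p, Ap);   /* step length along p     */
        for (int i = 0; i < n; i++) {
            x[i] += alpha * p[i];
            r[i] -= alpha * Ap[i];
        }
        double rr_new = dot(n, r, r);
        double beta = rr_new / rr;           /* direction update weight */
        for (int i = 0; i < n; i++)
            p[i] = r[i] + beta * p[i];
        rr = rr_new;
    }

    free(r); free(p); free(Ap);
    return k;
}

int main(void)
{
    /* Tiny 2x2 SPD example: A = [4 1; 1 3], b = (1, 2). */
    const double A[4] = { 4.0, 1.0, 1.0, 3.0 };
    const double b[2] = { 1.0, 2.0 };
    double x[2] = { 0.0, 0.0 };

    int iters = cg(2, A, b, x, 1e-12, 100);
    printf("converged in %d iterations: x = (%.6f, %.6f)\n",
           iters, x[0], x[1]);
    return 0;
}
```

For the 2x2 example the solver converges in two iterations to x ≈ (0.0909, 0.6364). The per-iteration structure visible here, one matrix-vector product followed by reductions (dot products) and element-wise vector updates, is what makes CG amenable to the deeply pipelined floating-point datapath described in the report.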
