论文信息 - An FPGA-Based Application-Specific Processor for Efficient Reduction of Multiple Variable-Length Floating-Point Data Sets

An FPGA-Based Application-Specific Processor for Efficient Reduction of Multiple Variable-Length Floating-Point Data Sets

Reconfigurable computers (RCs) that combine generalpurpose processors with field-programmable gate arrays (FPGAs) are now available. In these exciting systems, the FPGAs become reconfigurable application-specific processors (ASPs). Specialized high-level language (HLL) to hardware description language (HDL) compilers allow these ASPs to be reconfigured using HLLs. In our research we describe a novel toroidal data structure and scheduling algorithm that allows us to use an HLL-to-HDL environment to implement a high-performance ASP that reduces multiple, variable-length sets of 64-bit floating-point data. We demonstrate the effectiveness of our ASP by using it to accelerate a sparse matrix iterative solver. We compare actual wall clock run times of a production-quality software iterative solver with an ASP-augmented version of the same solver on a current generation RC. Our ASP-augmented solver runs up to 2.4 times faster than software. Estimates show that this same design can run over 6.4 times faster on a next-generation RC.

Viktor K. Prasanna | Gerald R. Morris | Richard D. Anderson

[1] Gerald Estrin,et al. Organization of computer systems: the fixed plus variable structure computer , 1960, IRE-AIEE-ACM '60 (Western).

[2] Youcef Saad,et al. SPARSKIT: a basic tool kit for sparse matrix computations - Version 2 , 1990 .

[3] Yvon Savaria,et al. A flexible floating-point format for optimizing data-paths and operators in FPGA based DSPs , 2002, FPGA '02.

[4] Viktor K. Prasanna,et al. High-performance and area-efficient reduction circuits on FPGAs , 2005, 17th International Symposium on Computer Architecture and High Performance Computing (SBAC-PAD'05).

[5] Viktor K. Prasanna,et al. An FPGA-based floating-point Jacobi iterative solver , 2005, 8th International Symposium on Parallel Architectures,Algorithms and Networks (ISPAN'05).

[6] Wayne Luk,et al. Automating custom-precision function evaluation for embedded processors , 2005, CASES '05.

[7] Brent E. Nelson,et al. Higher radix floating-point representations for FPGA-based arithmetic , 2005, 13th Annual IEEE Symposium on Field-Programmable Custom Computing Machines (FCCM'05).

[8] Viktor K. Prasanna,et al. Designing scalable FPGA-based reduction circuits using pipelined floating-point cores , 2005, 19th IEEE International Parallel and Distributed Processing Symposium.

[9] Viktor K. Prasanna,et al. A Hybrid Approach for Mapping Conjugate Gradient onto an FPGA-Augmented Reconfigurable Supercomputer , 2006, 2006 14th Annual IEEE Symposium on Field-Programmable Custom Computing Machines.

[10] Jeffrey Hammes,et al. A Transformational Approach to High Performance Embedded Computing , 2004 .

[11] Viktor K. Prasanna,et al. High Performance Linear Algebra Operations on Reconfigurable Systems , 2005, ACM/IEEE SC 2005 Conference (SC'05).

[12] Liang-Gee Chen,et al. JPEG, MPEG-4, and H.264 codec IP development , 2005, Design, Automation and Test in Europe.

[13] Yong Dou,et al. 64-bit floating-point FPGA matrix multiplication , 2005, FPGA '05.

[14] André DeHon,et al. Floating-point sparse matrix-vector multiply for FPGAs , 2005, FPGA '05.

[15] Viktor K. Prasanna,et al. A Library of Parameterizable Floating-Point Cores for FPGAs and Their Application to Scientific Computing , 2005, ERSA.