论文信息 - High-Performance and Parameterized Matrix Factorization on FPGAs

High-Performance and Parameterized Matrix Factorization on FPGAs

FPGAs have become an attractive choice for scientific computing. In this paper, we propose a high performance design for LU decomposition, a key kernel in many scientific and engineering applications. Our design achieves the optimal performance for LU decomposition using the available hardware resources. The design is parameterized. Thus, it can be easily adapted to various hardware constraints. Experimental results show that our design achieves high performance and offers good scalability. Our implementation on a Xilinx Virtex-II Pro XC2VP100 achieves superior sustained floating-point performance over existing FPGA-based implementations and optimized libraries on the state-of-the-art processors.

Viktor K. Prasanna | Ling Zhuo

[1] Wang Chen,et al. An FPGA implementation of the two-dimensional finite-difference time-domain (FDTD) algorithm , 2004, FPGA '04.

[2] Sotirios G. Ziavras,et al. Performance optimization of an FPGA-based configurable multiprocessor for matrix operations , 2003, Proceedings. 2003 IEEE International Conference on Field-Programmable Technology (FPT) (IEEE Cat. No.03EX798).

[3] Viktor K. Prasanna,et al. High Performance Linear Algebra Operations on Reconfigurable Systems , 2005, ACM/IEEE SC 2005 Conference (SC'05).

[4] Viktor K. Prasanna,et al. A Library of Parameterizable Floating-Point Cores for FPGAs and Their Application to Scientific Computing , 2005, ERSA.

[5] James Demmel,et al. LAPACK Users' Guide, Third Edition , 1999, Software, Environments and Tools.

[6] Viktor K. Prasanna,et al. Time and Energy Efficient Matrix Factorization Using FPGAs , 2003, FPL.

[7] Thomas H. Cormen,et al. Introduction to algorithms [2nd ed.] , 2001 .

[8] Viktor K. Prasanna,et al. Efficient Floating-point Based Block LU Decomposition on FPGAs , 2004, ERSA.