Energy-Efficient Computations on FPGAs

Energy dissipation has recently become an important performance metric for computations on FPGAs. In this paper, we summarize our recent efforts in developing an algorithm-level design methodology for optimizing the energy performance of FPGA-based implementations. For kernels, our design methodology consists of four steps: domain selection, domain-specific energy modeling, domain-space exploration, and low-level simulation. To achieve system-level energy efficiency, we outline a design methodology that integrates the kernel-level design methodology. Both design methodologies can be used to achieve not only energy efficiency but also latency, area, and power efficiency. We consider signal processing kernels as illustrative examples and demonstrate energy- and time-efficient algorithms and implementations for them on FPGAs. Examples of energy performance gains obtained through algorithmic optimizations include a 29–51% improvement for a matrix multiplication kernel, a 57–78% improvement for an FFT kernel, and a 10–60% improvement for a floating-point LU decomposition kernel over state-of-the-art implementations. Similarly, the system-level design approach achieved a 41–46% improvement in energy performance over a greedy approach for an MVDR adaptive beamforming application. Finally, we briefly describe a high-level tool for obtaining parameterized and energy-efficient designs on FPGAs.
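
To make the modeling and exploration steps of the kernel-level methodology concrete, the following is a minimal sketch of how a high-level energy model can drive a search over a small design space. It is not the model used in this work: the component power values, the cycle-count formula, and the names `POWER_MW`, `estimate`, and `explore` are all hypothetical and chosen only for illustration. The sketch assumes an n x n matrix multiplication mapped onto a linear array of processing elements (PEs) at a fixed clock rate.

```python
# Hypothetical per-component power figures in mW; illustrative only, not measured values.
POWER_MW = {"multiplier": 12.0, "adder": 3.0, "register": 0.5, "io": 8.0}


def estimate(n, num_pes, clock_mhz):
    """Toy energy model for an n x n matrix multiply on a linear array of num_pes PEs.

    Returns (energy in nJ, latency in microseconds). The cycle count is an
    assumed formula: evenly divided multiply-accumulate work plus pipeline fill.
    """
    cycles = n * n * n / num_pes + num_pes
    latency_us = cycles / clock_mhz  # a clock of f MHz runs f cycles per microsecond
    pe_power = POWER_MW["multiplier"] + POWER_MW["adder"] + 2 * POWER_MW["register"]
    power_mw = num_pes * pe_power + POWER_MW["io"]
    energy_nj = power_mw * latency_us  # mW * us = nJ
    return energy_nj, latency_us


def explore(n, pe_options, clock_mhz, max_latency_us):
    """Enumerate candidate designs and return the lowest-energy one that meets
    the latency bound, as (energy_nj, latency_us, num_pes), or None if none does."""
    best = None
    for num_pes in pe_options:
        energy_nj, latency_us = estimate(n, num_pes, clock_mhz)
        if latency_us > max_latency_us:
            continue  # violates the latency constraint
        candidate = (energy_nj, latency_us, num_pes)
        if best is None or candidate < best:
            best = candidate
    return best


if __name__ == "__main__":
    print(explore(n=16, pe_options=[1, 2, 4, 8, 16], clock_mhz=100.0, max_latency_us=50.0))
```

In this toy model, adding PEs shortens the time over which the fixed I/O power is spent but raises the array's own power, so the lowest-energy design is not necessarily the fastest or the smallest one. In a flow of the kind outlined above, only the few candidates selected this way would then be passed to low-level simulation for accurate energy estimates.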
