Energy efficiency has emerged as a key performance metric. In this work, we first implement a baseline architecture for matrix multiplication, parameterized by the number of processing elements and the type of storage memory. We map this architecture onto a state-of-the-art Field Programmable Gate Array (FPGA) and generate a design space to show how these parameters affect energy efficiency, defined as the number of operations per Joule. We find that on-chip memory accounts for the largest share of power consumption among all components. To improve energy performance, we propose a memory activation schedule. With this schedule, the optimized design achieves a 2.2x improvement in Energy×Area×Time (EAT) and a 1.33x improvement in energy efficiency compared with the state-of-the-art matrix multiplication core.
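The two metrics named above can be made concrete with a minimal Python sketch. It is an illustration only, not the authors' evaluation code: the function names, the units, and the convention of counting 2n³ operations for an n×n multiplication (one multiply and one add per inner-product term) are assumptions introduced here.

```python
def matmul_ops(n):
    """Operation count for an n x n matrix multiplication.

    Assumption: one multiply and one add per inner-product term (2 * n^3).
    """
    return 2 * n ** 3


def energy_joules(power_watts, time_seconds):
    """Energy consumed at a constant average power over the run time."""
    return power_watts * time_seconds


def energy_efficiency(num_ops, power_watts, time_seconds):
    """Energy efficiency as defined in the abstract: operations per Joule."""
    return num_ops / energy_joules(power_watts, time_seconds)


def eat(power_watts, time_seconds, area):
    """Energy x Area x Time product; lower is better.

    The area unit (e.g. slices) is an assumption of this sketch.
    """
    return energy_joules(power_watts, time_seconds) * area * time_seconds
```

Under these definitions, higher energy efficiency is better while lower EAT is better, so an improvement factor in EAT corresponds to the ratio of the baseline EAT to the optimized EAT.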