Radix-2 multi-dimensional transposition-free FFT algorithm for Modern Single Instruction Multiple Data (SIMD) architectures

A general radix-2 FFT algorithm was recently developed and implemented for modern Single Instruction Multiple Data (SIMD) architectures. This algorithm (SIMD-FFT) was found to be faster than any scalar FFT implementation, and also faster than other FFT implementations that use the SIMD architecture, for complex 1D and 2D input data [1]. In this paper, the SIMD-FFT algorithm is extended to handle multi-dimensional input data; the new approach does not make use of matrix transposition. The results are compared against FFTW for the 2D and 3D cases. Overall, the SIMD-FFT was found to be faster for complex 2D input data (by 82% up to 343%) and for complex 3D input data (by 59.5% up to 198%).
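
To illustrate what "transposition-free" can mean for a multi-dimensional FFT, the following is a minimal scalar sketch in Python, not the SIMD-FFT algorithm itself. It assumes a row-column decomposition over a flat row-major buffer, in which the column pass reads the data with a stride instead of transposing the matrix. The function names `fft_radix2_strided` and `fft2_no_transpose` are hypothetical and chosen only for this example.

```python
import cmath


def fft_radix2_strided(buf, start, stride, n):
    """In-place radix-2 decimation-in-time FFT over the n elements
    buf[start], buf[start + stride], ..., buf[start + (n - 1) * stride]."""
    if n < 1 or n & (n - 1):
        raise ValueError("n must be a power of two")

    # Bit-reversal permutation of the strided sequence.
    j = 0
    for i in range(1, n):
        bit = n >> 1
        while j & bit:
            j ^= bit
            bit >>= 1
        j |= bit
        if i < j:
            a, b = start + i * stride, start + j * stride
            buf[a], buf[b] = buf[b], buf[a]

    # Butterfly stages: sub-transform sizes 2, 4, ..., n.
    size = 2
    while size <= n:
        half = size // 2
        w_step = cmath.exp(-2j * cmath.pi / size)
        for block in range(0, n, size):
            w = 1 + 0j
            for k in range(half):
                a = start + (block + k) * stride
                b = a + half * stride
                t = w * buf[b]
                buf[b] = buf[a] - t
                buf[a] = buf[a] + t
                w *= w_step
        size *= 2


def fft2_no_transpose(data, rows, cols):
    """2D FFT of a flat row-major sequence, with no transposition step."""
    buf = list(data)
    # Row pass: each row is contiguous (stride 1).
    for r in range(rows):
        fft_radix2_strided(buf, r * cols, 1, cols)
    # Column pass: each column is visited in place with stride `cols`.
    for c in range(cols):
        fft_radix2_strided(buf, c, cols, rows)
    return buf
```

The same strided routine extends to a third dimension by adding one more pass with a stride of `rows * cols`; how the SIMD-FFT vectorizes these passes is described in the paper and is not modeled here.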

[1] Matteo Frigo et al. A fast Fourier transform compiler, 1999, PLDI '99.

[2] C. Van Loan. Computational Frameworks for the Fast Fourier Transform, 1992.

[3] Intel Corporation et al. IA-32 Intel Architecture Software Developers Manual, 2004.

[4] Paul Rodriguez V. A radix-2 FFT algorithm for Modern Single Instruction Multiple Data (SIMD) architectures, 2002, 2002 IEEE International Conference on Acoustics, Speech, and Signal Processing.

[5] Franz Franchetti et al. Architecture independent short vector FFTs, 2001, 2001 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings (Cat. No.01CH37221).

[6] A.K. Krishnamurthy. Multidimensional digital signal processing, 1985, Proceedings of the IEEE.