Optimal circuits for parallel bit reversal

In this paper, we develop novel parallel circuit designs for calculating the bit reversal. To perform bit reversal on 2^n data words, the designs take 2^k (k < n) words as input each cycle. The circuits consist of concatenated single-port buffers and 2-to-1 multiplexers and use the minimum number of registers for control. The designs use the minimum number of single-port memory banks necessary for calculating continuous-flow bit reversal, as well as a near-optimal 2^n memory words. The proposed parallel circuits can be built for any given fixed k and n, and achieve superior performance over the state of the art for calculating the bit reversal in parallel multi-path FFT architectures.
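As a point of reference, the permutation that such circuits compute can be sketched in software. The following is only a minimal behavioral model, not the proposed hardware: it ignores the buffers, multiplexers and memory banks, and simply shows which word should appear in which output group when 2^k words are delivered per cycle. The function names `bit_reverse` and `stream_bit_reversal` are hypothetical and chosen for illustration.

```python
def bit_reverse(index: int, n: int) -> int:
    """Return `index` with its n least-significant bits in reversed order."""
    result = 0
    for _ in range(n):
        result = (result << 1) | (index & 1)  # shift in the current lowest bit
        index >>= 1
    return result


def stream_bit_reversal(words, n, k):
    """Behavioral model (assumption: not the paper's circuit).

    Takes 2**n words in natural order and yields them in bit-reversed
    order, 2**k words per "cycle".
    """
    size, lanes = 1 << n, 1 << k
    assert len(words) == size and k < n
    reordered = [words[bit_reverse(i, n)] for i in range(size)]
    for start in range(0, size, lanes):
        yield reordered[start:start + lanes]


if __name__ == "__main__":
    # n = 3 (8 words), k = 1 (2 words per cycle)
    for cycle, group in enumerate(stream_bit_reversal(list(range(8)), n=3, k=1)):
        print(cycle, group)
```

With n = 3 and k = 1, the model emits the groups [0, 4], [2, 6], [1, 5], [3, 7], which is the bit-reversed order of the indices 0 to 7 delivered two words at a time.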
