This post comes after a deliberately long gap, since there are other blogs, such as the recently launched Programmable Planet, that are far superior in content and substance.
Fast Fourier Transforms (FFTs) are almost ubiquitous for anyone dealing with signal processing and communications systems. Time-domain analysis alone rarely gives us the information we need, because signal processing relies mostly on frequency-domain techniques such as modulation, up/down-conversion, and filtering. In short, an FFT is required in almost every design, for either on-line or off-line analysis. For example, peak search/scan is generally performed in the spectral domain.
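To make the peak-search idea concrete, here is a minimal software sketch in Python. It is not how the Xilinx core works internally; it is just a plain recursive radix-2 FFT followed by a search for the strongest bin. The transform length of 1024, the tone placement, and the helper names `fft_radix2` and `peak_bin` are my own assumptions for illustration.

```python
import cmath
import math

def fft_radix2(x):
    """Recursive radix-2 decimation-in-time FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a non-zero power of two")
    if n == 1:
        return [complex(x[0])]
    even = fft_radix2(x[0::2])
    odd = fft_radix2(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        # Twiddle factor applied to the odd-indexed half.
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out

def peak_bin(samples):
    """Index of the strongest bin in the positive-frequency half of the spectrum."""
    spectrum = fft_radix2(samples)
    half = len(samples) // 2
    magnitudes = [abs(v) for v in spectrum[:half]]
    return max(range(half), key=magnitudes.__getitem__)

N = 1024         # illustrative transform length (assumption, not the 64k case)
FS = 100e6       # sample rate in Hz (100 MSPS)
TONE_BIN = 100   # tone placed exactly on a bin so there is no spectral leakage

tone_hz = TONE_BIN * FS / N
signal = [math.cos(2 * math.pi * tone_hz * i / FS) for i in range(N)]

k = peak_bin(signal)
peak_hz = k * FS / N   # convert bin index back to a frequency
print(f"peak at bin {k}, {peak_hz / 1e6:.4f} MHz")
```

In the time domain this signal is just a wiggle; in the spectral domain the tone shows up as a single dominant bin, which is why the scan is done there.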
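The Xilinx Fast Fourier Transform IP core provides four architectures: Pipelined Streaming I/O, Radix-4 Burst I/O, Radix-2 Burst I/O, and Radix-2 Lite Burst I/O. As you would expect, there is a trade-off between speed (performance) and area.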
I’ve considered an example with a 64k (65536-point) transform length, clocked at 100 MHz, with a target data rate of 100 MSPS. The table shows the theoretical latency and resource estimates provided by the Xilinx IP core.

From the table, the speed-versus-area trade-off between the architectures is clear.
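The timing behind the 64k example can be worked out with a few lines of Python. This is only back-of-the-envelope arithmetic under my own assumptions: a 100 MHz core clock, one sample per clock on the input, and a hypothetical helper `keeps_up` that compares a cycle count against the per-frame budget. It does not reproduce the latency figures reported by the Xilinx core.

```python
import math

N = 65536            # transform length from the example (64k points)
SAMPLE_RATE = 100e6  # target data rate: 100 MSPS
CLOCK_HZ = 100e6     # assumed core clock of 100 MHz

# Time needed just to clock one full frame of N samples into the core.
frame_time_us = N / SAMPLE_RATE * 1e6

# Clock cycles available per frame if the input stream never pauses.
cycle_budget = round(N / SAMPLE_RATE * CLOCK_HZ)

# A radix-2 decomposition has log2(N) stages of N/2 butterflies each.
stages = int(math.log2(N))
butterflies = (N // 2) * stages

def keeps_up(cycles_per_frame, budget=cycle_budget):
    """Hypothetical helper: True if a frame is finished within the cycle budget."""
    return cycles_per_frame <= budget

print(f"frame time        : {frame_time_us:.2f} us")
print(f"cycle budget      : {cycle_budget} cycles per frame")
print(f"radix-2 stages    : {stages}")
print(f"radix-2 butterflies: {butterflies}")
# Assumption: a single butterfly engine doing one butterfly per clock.
print("one butterfly engine keeps up:", keeps_up(butterflies))
```

Under these assumptions a design that reuses a single butterfly engine needs far more cycles than a frame provides, so sustaining 100 MSPS calls for more parallel hardware. That is the speed-versus-area trade-off in numbers.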
FPGAs contain Block RAM (BRAM) and Distributed RAM (abbreviated DRAM here, not to be confused with dynamic RAM). BRAMs are dedicated memory blocks, and practically every modern FPGA has them. FPGA datasheets typically specify the total BRAM in Kbits.
DRAMs are RAMs that can be constructed from Look-Up Tables (LUTs).
SRAM-based FPGAs have LUTs. These LUTs can be combined to form small blocks of RAM, and these are the DRAMs.
This kind of RAM is called distributed because the LUTs are spread out across the FPGA fabric.
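A small Python model may help show why a LUT can double as RAM and why big buffers belong in BRAM. A k-input LUT is really a 2^k x 1-bit memory addressed by its inputs. The class `LutRam` and the function `luts_needed` below are hypothetical names of mine. The 6-input LUT default and the 32-bit sample width (say, 16-bit real plus 16-bit imaginary) are assumptions, and the estimate ignores the extra multiplexing logic and the fact that not every LUT on a device can be used as RAM.

```python
import math

class LutRam:
    """Behavioral model of one k-input LUT used as a 2**k x 1-bit RAM."""

    def __init__(self, inputs=4):
        self.depth = 2 ** inputs       # one storage cell per input combination
        self.cells = [0] * self.depth

    def write(self, addr, bit):
        self.cells[addr] = bit & 1     # each cell holds a single bit

    def read(self, addr):
        return self.cells[addr]

def luts_needed(depth, width, lut_inputs=6):
    """Rough LUT count for a depth x width RAM (ignores output multiplexers)."""
    bits_per_lut = 2 ** lut_inputs
    return math.ceil(depth / bits_per_lut) * width

# One 4-input LUT behaves like a 16 x 1 RAM.
lut = LutRam(inputs=4)
lut.write(5, 1)
print(lut.depth, lut.read(5), lut.read(6))

# A small 16 x 8 register file fits in a handful of 4-input LUTs.
small = luts_needed(depth=16, width=8, lut_inputs=4)

# A full 64k-sample buffer of 32-bit words (assumed width) is another story.
large = luts_needed(depth=65536, width=32)
print(small, large)
```

The small case costs a few LUTs, which is where distributed RAM shines. The 64k buffer would eat tens of thousands of LUTs under these assumptions, which is why large FFT buffers usually end up in BRAM.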
- This post is inspired by Clive “Max” Maxfield’s book on FPGAs.
