Fourier Transform
Write any signal as a sum of sinusoids — a bridge from signal to compute.
Prerequisites
What you'll learn here
- Explain the idea of decomposing a signal into frequency components
- Write the continuous and discrete (DFT) Fourier transforms
- State the FFT's O(N log N) complexity and why it matters
- Recall the convolution theorem (convolution in time = multiplication in frequency)
- Show why SIDRA crossbars often skip FFT by doing convolution directly
Hook: Notes in Music = Secret of Signals
Press C, E, G on a piano simultaneously — you hear a chord. The chord is a complex wave, but your brain separates it into three notes. That’s the natural Fourier transform your ear performs.
Joseph Fourier showed in 1822: every periodic signal can be written as a sum of sines and cosines. That’s the foundation of modern signal processing (radio, audio, image, MRI, JPEG, MP3, WiFi, 5G — all of it).
Why is Fourier important for AI?
- Convolution theorem: convolution in the time domain = multiplication in the frequency domain. Foundation of CNNs.
- FFT (Fast Fourier Transform): an algorithm taking the DFT from $O(N^2)$ down to $O(N \log N)$. One of the cornerstones of modern computing.
- Spectral methods: processing signals in frequency space is often more efficient (filtering, compression, denoising).
Interesting question for SIDRA: can the crossbar do Fourier? Yes — the DFT is an MVM (matrix × vector). But more interesting: SIDRA can do convolution directly in the crossbar → it doesn’t need FFT.
This chapter builds Fourier from intuition to math, reveals the FFT trick, and shows why SIDRA “skips” Fourier.
Intuition: Signal = Sum of Frequencies
A square wave looks simple, but it has infinitely many frequency components:

$$x(t) = \frac{4}{\pi}\left(\sin(2\pi f t) + \frac{1}{3}\sin(2\pi \cdot 3 f t) + \frac{1}{5}\sin(2\pi \cdot 5 f t) + \dots\right)$$

Only odd harmonics, with amplitude falling as $1/k$. Add enough terms and a square wave emerges (the Gibbs phenomenon leaves a small ripple at the jumps, but the approximation is good).
General principle: every “well-behaved” signal is a sum of sinusoids. Decomposing it = the Fourier transform.
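This can be checked numerically. A minimal sketch with NumPy (the term counts and the 1000-point sample grid are illustrative choices):

```python
import numpy as np

def square_wave_approx(t, n_terms):
    """Partial Fourier series of a unit square wave:
    x(t) = (4/pi) * sum over odd k of sin(2*pi*k*t) / k."""
    x = np.zeros_like(t)
    for k in range(1, 2 * n_terms, 2):  # odd harmonics 1, 3, 5, ...
        x += (4 / np.pi) * np.sin(2 * np.pi * k * t) / k
    return x

t = np.linspace(0, 1, 1000, endpoint=False)
target = np.sign(np.sin(2 * np.pi * t))  # ideal square wave

err_5 = np.mean(np.abs(square_wave_approx(t, 5) - target))
err_50 = np.mean(np.abs(square_wave_approx(t, 50) - target))
print(err_5, err_50)  # mean error shrinks as harmonics are added
```

The mean error keeps shrinking with more terms; the Gibbs overshoot near the jumps, however, never fully disappears.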
A signal’s “spectrum”:
| Signal | Spectrum |
|---|---|
| Single sinusoid | One peak (at that frequency) |
| Square wave | Odd harmonics 1f, 3f, 5f, … |
| Impulse | Flat spectrum (equal at every frequency) |
| White noise | Flat spectrum (random) |
| Beethoven’s 5th | Cluster of complex peaks |
Practical use:
- Filtering: remove unwanted frequencies (bass/treble, denoising).
- Compression: JPEG/MP3 drop the unimportant frequencies (compression).
- Function analysis: which frequencies dominate? FFT shows.
- Convolution: filtering is a convolution; in the frequency domain it becomes multiplication → fast.
Formalism: Continuous and Discrete Fourier
Continuous Fourier transform:
For a continuous signal $x(t)$:

$$X(f) = \int_{-\infty}^{\infty} x(t)\, e^{-i 2\pi f t}\, dt$$

Inverse:

$$x(t) = \int_{-\infty}^{\infty} X(f)\, e^{i 2\pi f t}\, df$$

$X(f)$ is complex: amplitude = how strong, phase = when.
Discrete Fourier (DFT):
For an array $x[n]$ ($N$ samples):

$$X[k] = \sum_{n=0}^{N-1} x[n]\, e^{-i 2\pi k n / N}, \qquad k = 0, \dots, N-1$$

This is an MVM — matrix entries $W_{kn} = e^{-i 2\pi k n / N}$. Complexity: $O(N^2)$.
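The DFT-as-MVM claim is easy to verify directly. A sketch with NumPy ($N = 8$ is an arbitrary illustrative size):

```python
import numpy as np

N = 8
n = np.arange(N)
# DFT matrix: W[k, n] = exp(-2j*pi*k*n / N)
W = np.exp(-2j * np.pi * np.outer(n, n) / N)

x = np.random.default_rng(0).standard_normal(N)
X_mvm = W @ x           # DFT as a plain matrix-vector multiply, O(N^2)
X_fft = np.fft.fft(x)   # same result via the FFT, O(N log N)

print(np.allclose(X_mvm, X_fft))  # True
```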
FFT (Fast Fourier Transform):
The Cooley–Tukey algorithm (1965): even/odd split + recursion. Complexity $O(N \log N)$.
| $N$ | DFT ($N^2$) | FFT ($N \log_2 N$) |
|---|---|---|
| 256 | 65,536 | 2,048 |
| 1,024 | ~1M | 10,240 |
| 1M | ~10¹² | 20M |
Without the FFT, modern digital signal processing would be impossible. Maybe the most important algorithm in modern computing.
Convolution theorem:
Two signals $x(t)$ and $h(t)$. Convolution:

$$(x * h)(t) = \int_{-\infty}^{\infty} x(\tau)\, h(t - \tau)\, d\tau$$

Fourier transform:

$$\mathcal{F}\{x * h\} = X(f) \cdot H(f)$$

Convolution in time = multiplication in frequency. Very powerful:
- Time-domain convolution complexity: $O(N^2)$.
- Go to frequency + multiply + come back (FFT): $O(N \log N)$.
Practical effect: long convolutions (radar, audio, image processing) are accelerated with FFT.
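The theorem is easy to verify numerically. A sketch (the 256-sample signal, 32-tap filter, and 512-point FFT mirror the lab exercise later in this chapter):

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.standard_normal(256)   # signal
h = rng.standard_normal(32)    # filter

# Time domain: direct (full) convolution, 256 + 32 - 1 = 287 samples
y_direct = np.convolve(x, h)

# Frequency domain: zero-pad to >= 287, multiply spectra, transform back
Nfft = 512
y_fft = np.fft.ifft(np.fft.fft(x, Nfft) * np.fft.fft(h, Nfft)).real[:287]

print(np.allclose(y_direct, y_fft))  # True
```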
CNN tie-in:
A CNN’s conv layer = convolution with a small kernel (a 3×3 or 5×5 filter). Small filter → direct convolution is fast; FFT usually unnecessary.
But with large filters (e.g., global attention) the FFT kernel can be used.
Convolution on SIDRA:
The SIDRA crossbar can do convolution directly. Sliding-window MVM:
- Filter weights programmed into the crossbar.
- For each position, the input window enters, MVM runs → output pixel.
- An N×N filter on an M×M image → M² MVMs (one per output pixel), each over the filter’s N² weights.
Crossbar ~10 ns/MVM → full image convolution at microsecond scale.
Need the FFT? For SIDRA, usually no — small filters direct, large filters could use FFT (still MVM, still maps to crossbar).
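The sliding-window procedure can be sketched in software (1-D for brevity; `crossbar_conv1d` is a hypothetical name, and each inner product stands in for one crossbar MVM with the filter weights programmed as conductances):

```python
import numpy as np

def crossbar_conv1d(x, h):
    """Direct convolution as repeated 'crossbar' inner products:
    for each output position, one window of x enters, one MVM runs."""
    N, M = len(h), len(x)
    y = np.zeros(M - N + 1)
    for i in range(len(y)):          # one "MVM" per output sample
        window = x[i:i + N]
        y[i] = h[::-1] @ window      # flipped filter = true convolution
    return y

x = np.arange(6, dtype=float)        # toy signal
h = np.array([1.0, 0.0, -1.0])       # toy edge-detecting filter
print(crossbar_conv1d(x, h))         # matches np.convolve(x, h, 'valid')
```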
FFT on the crossbar:
The DFT matrix is fixed. Programmable into the crossbar → DFT becomes an MVM.
But: complex numbers. Memristors are real-valued conductances. Fix: 2× the size, storing real and imaginary parts separately. An $N \times N$ complex DFT matrix → a $2N \times 2N$ real crossbar.
A 256-point DFT: a 256×256 complex matrix → 512×512 real = 4 SIDRA crossbars (256×256 each). Possible, but the FFT’s advantage disappears (the crossbar is already parallel: effectively $O(1)$ per MVM).
Practical decision: in SIDRA, do compute-heavy convolution directly in the crossbar; FFT not needed. FFT is more sensible on a digital CPU/DSP (specialized hardware exists).
Other Fourier-like transforms:
- DCT (Discrete Cosine Transform): the basis of JPEG, compression.
- Hadamard transform: ±1 entries → no multiplication, only ±. Fast approximation.
- Wavelet: time + frequency together. Wavelet-based compression (JPEG-2000).
All MVM-based → doable on SIDRA, but specialized digital hardware is usually more efficient.
Spectral methods in AI:
Recent years have brought spectral neural networks:
- Fourier Neural Operators (Li et al. 2020): partial-differential-equation solvers.
- Spectral attention (Lee-Thorp et al. 2022): FFT to accelerate attention.
- Wavelet transformer (Wang et al. 2024): hybrid time-frequency.
These models are interesting for SIDRA — partly hybrid (CMOS FFT + crossbar MVM) at the Y10 horizon.
Experiment: Compute a 4-Sample DFT by Hand
4-sample sequence: $x = [1, 0, -1, 0]$.

DFT ($X[k] = \sum_{n=0}^{3} x[n]\, e^{-i 2\pi k n / 4}$):

$$X[0] = 1 + 0 - 1 + 0 = 0$$
$$X[1] = 1 \cdot 1 + 0 \cdot (-i) + (-1)(-1) + 0 \cdot i = 2$$
$$X[2] = 1 \cdot 1 + 0 \cdot (-1) + (-1)(1) + 0 \cdot (-1) = 0$$
$$X[3] = 1 \cdot 1 + 0 \cdot i + (-1)(-1) + 0 \cdot (-i) = 2$$

So $X = [0, 2, 0, 2]$. Interpretation: the signal looks like $\cos(\pi n / 2)$ (only $k=1$ and $k=3$ peaks). As expected: the sequence is literally a cosine wave (1, 0, -1, 0).
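The hand computation can be double-checked with a one-liner:

```python
import numpy as np

x = np.array([1.0, 0.0, -1.0, 0.0])  # one period of cos(pi*n/2)
X = np.fft.fft(x)
print(X)  # [0, 2, 0, 2]: peaks only at k = 1 and k = 3
```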
On a SIDRA crossbar:
DFT matrix $W_4$ (with $\omega = e^{-i\pi/2} = -i$):

$$W_4 = \begin{pmatrix} 1 & 1 & 1 & 1 \\ 1 & -i & -1 & i \\ 1 & -1 & 1 & -1 \\ 1 & i & -1 & -i \end{pmatrix}$$

Split real and imaginary:

$$\mathrm{Re}(W_4) = \begin{pmatrix} 1 & 1 & 1 & 1 \\ 1 & 0 & -1 & 0 \\ 1 & -1 & 1 & -1 \\ 1 & 0 & -1 & 0 \end{pmatrix}, \qquad \mathrm{Im}(W_4) = \begin{pmatrix} 0 & 0 & 0 & 0 \\ 0 & -1 & 0 & 1 \\ 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & -1 \end{pmatrix}$$

Two separate crossbars (4×4 each) compute the DFT. Result: real + i × imaginary.
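The real/imaginary split can be sketched as two real MVMs (a simulation, assuming a real-valued input; a complex input would need two more MVMs):

```python
import numpy as np

N = 4
n = np.arange(N)
W = np.exp(-2j * np.pi * np.outer(n, n) / N)  # 4-point DFT matrix

W_re, W_im = W.real, W.imag   # two real matrices -> two crossbars

x = np.array([1.0, 0.0, -1.0, 0.0])
X = W_re @ x + 1j * (W_im @ x)   # combine the two crossbar outputs

print(np.allclose(X, np.fft.fft(x)))  # True
```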
But for a digital FFT: a 4-point DFT needs only a handful of complex multiplies, sub-microsecond even on a modest CPU. SIDRA is overkill at this size. For a 1024-point DFT (common in signal processing), the crossbar becomes a real option.
Quick Quiz
Lab Exercise
Convolution vs FFT on SIDRA crossbar for a 256-sample signal.
Data:
- Signal: $x[n]$, $N = 256$ samples (e.g. audio).
- Filter: $h[n]$, $K = 32$ taps (e.g. low-pass).
- Convolution $y = x * h$: $y[n] = \sum_{k=0}^{31} h[k]\, x[n-k]$.
Compute complexity:
(a) Direct-convolution FLOPs? (b) FFT-based: FFT(, 512) + FFT(, 512, zero-pad) + elementwise mul + inverse FFT(512). FLOPs? (c) Direct convolution on SIDRA: how many MVMs, how many ns? (d) FFT on SIDRA: 512-DFT crossbar size, MVMs, time? (e) Which is better for SIDRA?
Solutions
(a) Direct: 287 outputs × (32 multiplies + 31 adds) = 287 × 63 ≈ 18K FLOPs.
(b) FFT(512) ≈ 512 × log₂ 512 = 512 × 9 ≈ 4,600 FLOPs/transform. 3 transforms (x and h forward + one inverse) + 512 elementwise multiplies ≈ 14,300 FLOPs. Less than direct, but the log advantage is small here (the filter is small).
(c) Direct on SIDRA: 32-input filter. In a 256×256 crossbar, use 32 rows → 287 sliding windows. Each ~10 ns. Total: ~2900 ns ≈ 3 µs.
(d) FFT on SIDRA: 512×512 DFT matrix → 4 crossbars (256×256). Real + imaginary split → 8 crossbars. 3 FFT × 4 MVM = 12 MVMs × 10 ns = 120 ns. Plus elementwise: 5 ns. Total ~125 ns. Much faster.
(e) For large signals, FFT on SIDRA is good — especially for parallel batches. For small signals, direct convolution suffices. Hybrid: pick by signal size. Compiler decision (chapter 6.7).
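The back-of-envelope numbers above can be reproduced in a few lines (the FLOP formulas are the same rough models used in the solutions, not exact operation counts):

```python
M, K, Nfft = 256, 32, 512                 # signal, filter, FFT size
n_out = M + K - 1                         # 287 output samples

flops_direct = n_out * (K + (K - 1))      # 287 * 63 ~= 18K FLOPs
flops_fft = 3 * Nfft * 9 + Nfft           # 3 transforms (log2 512 = 9) + mul

mvm_ns = 10                               # assumed crossbar MVM latency
t_direct_ns = n_out * mvm_ns              # one MVM per sliding window
t_fft_ns = 12 * mvm_ns + 5                # 12 MVMs + elementwise stage

print(flops_direct, flops_fft, t_direct_ns, t_fft_ns)
```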
Cheat Sheet
- Fourier: every signal is a sum of sinusoids.
- DFT: discrete transform, $X[k] = \sum_{n=0}^{N-1} x[n]\, e^{-i 2\pi k n / N}$. Complexity $O(N^2)$.
- FFT: Cooley–Tukey 1965, $O(N \log N)$. A cornerstone of modern computing.
- Convolution theorem: $\mathcal{F}\{x * h\} = X \cdot H$. Math basis of CNNs.
- SIDRA tie: DFT is an MVM → crossbar can do it, but direct convolution is often more efficient.
- Hybrid: FFT crossbar for large signals; direct for small.
Vision: Spectral AI and SIDRA's Place
Most AI today works in the time domain (Transformers included). Spectral AI is a new wave:
- Y1 (today): Convolution direct on SIDRA; spectral analysis on CMOS FFT.
- Y3 (2027): Spectral attention prototype (Lee-Thorp style). Language model inference 2-3× faster.
- Y10 (2029): Hybrid time-frequency processing. Image and audio models.
- Y100 (2031+): Fourier Neural Operator hardware. Weather, fluid dynamics, PDE-based scientific AI.
- Y1000 (long horizon): Quantum FFT (Shor’s basis) hybrids. Classical + quantum spectral AI.
Meaning for Türkiye: spectral AI is new — big rivals haven’t dominated yet. SIDRA + spectral processing = collaboration opening with research bodies like TÜBİTAK BİLGEM.
Unexpected future: a direct-physical-Fourier chip. Optical waveguides naturally separate a signal’s frequency components as light (a prism!). On a photonic SIDRA Y100, Fourier could be “free”.
Further Reading
- Next chapter: 4.6 — Quantization and Quantization Error
- Previous: 4.4 — Probability and Noise
- Classical reference: Oppenheim & Schafer, Discrete-Time Signal Processing.
- Fourier math: Bracewell, The Fourier Transform and Its Applications.
- FFT discovery: Cooley & Tukey, An algorithm for the machine calculation of complex Fourier series, Math. Comput. 1965.
- Fourier Neural Operators: Li et al., Fourier Neural Operator for Parametric Partial Differential Equations, ICLR 2021.
- FNet (spectral attention): Lee-Thorp et al., FNet: Mixing Tokens with Fourier Transforms, NAACL 2022.