Noise Models
SIDRA's analog reality — controlling noise by design.
Prerequisites
What you'll learn here
- Name SIDRA's six noise sources (thermal, shot, 1/f, programming, IR drop, drift)
- Apply the noise-model math (σ formulas)
- Compute MVM SNR at cell/column/crossbar levels
- Explain noise-aware compiler strategies
- Validate Y1's tolerable noise budget against practical AI models
Hook: Analog = Noisy
Digital CMOS: a bit is either 0 or 1; noise below the switching threshold is simply absorbed. SIDRA analog: the signal is a continuous current, so every read carries noise.
We covered noise theory in 4.4. This chapter gives the practical SIDRA noise model: how many sources, which dominates, how to compute.
Bottom line: Y1 ~5% RMS noise → 6 effective bits → enough for INT8 AI. Y10 ~2% → 8 bits.
Intuition: 6 Noise Sources
When reading a SIDRA cell:
| Source | Chapter | Typical magnitude |
|---|---|---|
| 1. Thermal (Johnson) | 4.4 | ~5 nA RMS |
| 2. Shot | 4.4 | ~5-20 nA RMS |
| 3. 1/f (flicker) | 4.4 | ~10 nA RMS long-term |
| 4. Programming | 5.5 ISPP | ~5% absolute (~50 nA @ 1 µA) |
| 5. IR drop | 5.12 | ~5% systematic |
| 6. Drift | 5.2 | ~1% / year |
Total (per cell): ~75 nA RMS (~7.5% on a 1 µA signal).
That per-cell noise limits single-read precision. Crossbar-level summation and averaging do better.
Formalism: Six Noise Sources
1. Thermal (Johnson-Nyquist): σ_th = √(4·k_B·T·G·Δf). Typical SIDRA: 5-10 nA.
2. Shot: σ_shot = √(2·q·I·Δf). Dominant at low currents: 5-20 nA.
3. 1/f (flicker): power spectral density S(f) ∝ 1/f; the main long-term noise source. RMS ~10 nA.
4. Programming (post-ISPP):
With ISPP: σ_prog ≈ 1 nA at 100 µS, read at 0.25 V. Negligible.
Without ISPP (single-pulse): σ_prog ≈ 5% → 50 nA on a 1 µA signal.
5. IR drop:
End-of-WL cells see less voltage. Systematic error 5%. Fix: double-ended drive (chapter 5.12).
6. Drift:
Retention is finite. Typical 1% drift per year. Annual refresh.
Combined model: independent sources add in quadrature,
σ_total = √(σ_th² + σ_shot² + σ_1/f² + σ_prog² + σ_IR² + σ_drift²).
This assumes independence; in practice 1/f and drift are partially correlated, so the quadrature sum is a slight underestimate.
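The quadrature combination is easy to check numerically. A minimal sketch, using midpoint values from the table above (the exact per-source magnitudes are illustrative assumptions, not measured data); it roughly reproduces the ~75 nA total:

```python
import math

# Assumed Y1 per-cell RMS noise terms in nA, taken as midpoints of the
# ranges in the table above (illustrative, not measured values).
sigma = {
    "thermal":      5.0,
    "shot":        15.0,
    "flicker":     10.0,
    "programming": 50.0,   # single-pulse case, ~5% of 1 uA
    "ir_drop":     50.0,   # ~5% systematic, treated here as RMS
    "drift":       10.0,   # ~1%/year on 1 uA
}

# Independent sources combine in quadrature: sigma_total = sqrt(sum sigma_i^2)
sigma_total = math.sqrt(sum(s ** 2 for s in sigma.values()))
signal_nA = 1000.0  # 1 uA read current
print(f"total sigma = {sigma_total:.0f} nA "
      f"({100 * sigma_total / signal_nA:.1f}% of a 1 uA signal)")
```

Note that the programming and IR-drop terms dominate: reducing either has far more effect than improving the fundamental thermal/shot floor.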
MVM-level SNR:
A crossbar column sums N = 256 cell currents (with 256 columns read in parallel). Signal power grows as N², independent noise power as N, so power SNR improves by N, i.e., amplitude SNR improves by √N = 16.
Cell SNR (single read):
Signal 1 µA, noise 75 nA (all sources) → SNR = 1000/75 ≈ 13 → ~22 dB.
Column SNR:
Signal = 256 × 1 µA = 256 µA. Noise = √256 × 75 nA = 1.2 µA. SNR = 256/1.2 ≈ 213 → ~47 dB.
256-column parallel crossbar:
Single-MVM SNR = 47 dB → ~8 effective bits. Good.
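The cell-to-column SNR chain above can be verified in a few lines. A sketch using the standard ADC relation ENOB = (SNR_dB − 1.76)/6.02 (the function names are mine):

```python
import math

def snr_db(signal, noise):
    """Amplitude SNR in dB."""
    return 20 * math.log10(signal / noise)

def effective_bits(snr_dB):
    # Standard ADC relation: ENOB = (SNR_dB - 1.76) / 6.02
    return (snr_dB - 1.76) / 6.02

cell_signal, cell_sigma = 1.0, 0.075    # 1 uA signal, 75 nA noise
N = 256                                  # cells summed per column

col_signal = N * cell_signal             # signal amplitude adds as N
col_sigma = math.sqrt(N) * cell_sigma    # independent noise adds as sqrt(N)

col_snr = snr_db(col_signal, col_sigma)
print(f"cell:   {snr_db(cell_signal, cell_sigma):.0f} dB")
print(f"column: {col_snr:.0f} dB, {effective_bits(col_snr):.1f} effective bits")
```

The column ENOB comes out near 7.5 bits, which the chapter rounds to ~8.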
Model-level noise:
Single MVM is 8 bits. But AI models stack 10+ layers → noise compounds. 12-layer GPT-2: σ_out = √12 · σ_layer = 3.5 × 5% = 17%. Still tolerable (classification margin 20-50%).
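The √L compounding across layers is a one-line check:

```python
import math

# Independent per-layer noise compounds as sqrt(L) across L stacked layers.
L, sigma_layer = 12, 0.05          # 12-layer GPT-2-scale stack, 5% RMS per layer
sigma_out = math.sqrt(L) * sigma_layer
print(f"output noise: {sigma_out:.1%}")   # ~17%
```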
Tolerance:
AI models tolerate noise. 5-10% RMS is acceptable. SIDRA operates in this band.
Averaging improvement:
4× reads → noise drops by √4 = 2×. Single MVM 15 ns → 4 reads 60 ns. Throughput 4× lower but SNR +6 dB (9 effective bits).
Averaging on critical layers, single read on non-critical. Compiler decision.
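The averaging trade-off (noise down by √k, latency up by k) can be sketched as follows; the 15 ns single-MVM latency is taken from the text, and the helper name is an assumption:

```python
import math

def averaged(sigma, reads, t_read_ns=15.0):
    """k repeated reads: noise drops by sqrt(k), latency grows by k."""
    return sigma / math.sqrt(reads), reads * t_read_ns

sigma = 0.05                        # 5% single-read RMS noise
for k in (1, 4):
    s, t = averaged(sigma, k)
    gain_db = 20 * math.log10(math.sqrt(k))
    print(f"{k}x reads: sigma={s:.1%}, latency={t:.0f} ns, SNR gain=+{gain_db:.0f} dB")
```

This is exactly the knob the compiler turns: spend 4x latency on a layer only where the extra ~1 effective bit matters.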
Noise-aware compiler (chapter 6.7):
The compiler analyzes model weights:
- Big weights (|w| > 0.5): sensitive → SIDRA programming, averaging.
- Small weights (|w| < 0.1): unimportant → prune, don’t write to crossbar.
- Mid weights: standard 8-bit.
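The partitioning rule above can be sketched as a tiny classifier. The function name is hypothetical; the thresholds (0.5, 0.1) come from the text:

```python
def classify_weight(w, hi=0.5, lo=0.1):
    """Assign a crossbar handling strategy to one weight (sketch)."""
    if abs(w) > hi:
        return "averaged"   # sensitive: careful programming + 4x read averaging
    if abs(w) < lo:
        return "pruned"     # unimportant: never written to the crossbar
    return "standard"       # mid-range: standard 8-bit single read

weights = [0.7, -0.02, 0.3, -0.55, 0.05]
print([classify_weight(w) for w in weights])
```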
Whisper example:
30% pruning → smaller model, noise impact reduced.
Noise-injection training (QAT + NI):
Add ~5% noise to weights during training → model becomes robust. SIDRA inference noise is already learned.
Accuracy: standard INT8 76% → noise-injection training 76.5% (small gain).
Temperature effects:
HfO₂ conductance is thermally activated and follows an Arrhenius law: G(T) ∝ exp(−E_a / k_B·T). At typical activation energies, a 25 °C → 85 °C swing changes conductance by ~40%.
Calibration: thermal sensor per CU, calibrate per cluster. Temperature-aware scale factor.
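The temperature-aware scale factor follows directly from the Arrhenius model. A sketch assuming an activation energy of 0.05 eV, a hypothetical value chosen because it reproduces the ~40% swing quoted above (it is not a stated SIDRA parameter):

```python
import math

K_B_EV = 8.617e-5   # Boltzmann constant in eV/K

def g_rel(t_c, e_a=0.05, t_ref_c=25.0):
    """Arrhenius conductance ratio G(T)/G(T_ref).
    e_a = 0.05 eV is an assumed activation energy (illustrative)."""
    t, t_ref = t_c + 273.15, t_ref_c + 273.15
    return math.exp(-e_a / (K_B_EV * t)) / math.exp(-e_a / (K_B_EV * t_ref))

ratio = g_rel(85.0)
scale = 1.0 / ratio     # per-cluster correction factor from the thermal sensor
print(f"G(85C)/G(25C) = {ratio:.2f}, correction scale = {scale:.2f}")
```

The per-CU thermal sensor supplies T; the compiler multiplies column outputs by the correction scale during its periodic calibration pass.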
Noise + temperature + drift = composite model:
The SIDRA simulator (chapter 6.8) simulates all. Used for model testing/validation.
Real Y1 estimate:
Lab measurements:
- Single cell σ = 75 nA (lab).
- Column σ (256 parallel) = 1.2 µA.
- MVM effective bits = 8.
- Model accuracy loss (INT8 benchmark) = 0.3-0.5%.
Y10 target: σ reduced to ~30% of the Y1 value, i.e. ~2% RMS (ISPP improvements + tighter layout), for 10 effective bits.
Experiment: MNIST Inference Noise Impact
Model: MLP 784 → 128 → 10. 2 layers.
Noise:
- Layer 1: σ = 5% relative.
- Layer 2: σ = 5% relative.
- Combined: σ_out = √(5² + 5²) = 7% relative.
MNIST accuracy:
- FP32: 98%
- INT8 quantized: 97.8%
- INT8 + 5% noise (SIDRA): 97.5%
- INT8 + 10% noise: 96.5%
Averaging effect (4×):
- 5% → 2.5%.
- New accuracy: 97.9% (very close to FP32).
Y10 (2% noise):
- Single read 97.9%.
- No averaging needed.
Energy:
- Y1 + 4× averaging: 4 mJ × 4 = 16 mJ.
- Y10 single read: 3 mJ. 5× more efficient.
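The two-layer compounding in this experiment can be checked with a small Monte Carlo: inject relative weight noise into a random 784 → 128 → 10 MLP and measure the relative output error with and without 4x averaging. The network weights are random stand-ins, not a trained MNIST model, so only the error magnitudes (not accuracies) are meaningful:

```python
import numpy as np

rng = np.random.default_rng(1)
W1 = rng.standard_normal((784, 128)) / 28.0   # scaled random stand-in weights
W2 = rng.standard_normal((128, 10)) / 12.0
x = rng.standard_normal(784)

def forward(sigma):
    """Two-layer MLP with relative read noise injected at each layer."""
    h = np.maximum((W1 * (1 + sigma * rng.standard_normal(W1.shape))).T @ x, 0)
    return (W2 * (1 + sigma * rng.standard_normal(W2.shape))).T @ h

ref = forward(0.0)  # noiseless reference output

def rel_err(sigma, trials=200):
    """Mean relative L2 output error over Monte Carlo trials."""
    errs = [np.linalg.norm(forward(sigma) - ref) / np.linalg.norm(ref)
            for _ in range(trials)]
    return float(np.mean(errs))

print(f"single read (5% per layer): {rel_err(0.05):.1%}")
print(f"4x averaged (2.5% per layer): {rel_err(0.025):.1%}")
```

The single-read error lands near the √(5² + 5²) ≈ 7% predicted above, and averaging roughly halves it.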
Quick Quiz
Lab Exercise
Y1 noise budget limits.
Y1:
- Cell σ = 75 nA.
- Column σ = 1.2 µA.
- MVM effective bits = 8.
Questions:
(a) Total noise for ResNet-50 (50 layers)?
(b) Tolerable accuracy drop on ImageNet?
(c) Which layers should use 4× averaging?
(d) What does a 25 → 85 °C swing do?
(e) What's the Y10 σ target?
Solutions
(a) Independent layers add in quadrature: √50 × 5% RMS ≈ 35% total (worst case; in practice the model tolerates much of it).
(b) ImageNet FP32 76% → SIDRA INT8 noise ~74-75% (1-2% drop).
(c) The first 3 and last 3 layers (most sensitive: input feature extraction and the layers closest to the output). Middle layers single-read.
(d) Temperature-aware scaling: rescale outputs by G(T_ref)/G(T) using the per-CU thermal sensor; the compiler performs periodic calibration. Net effect < 1%.
(e) Y10 σ target ~2% (ISPP improvements + tighter layout). 10 effective bits. Noise drops → accuracy holds at FP32 levels.
Cheat Sheet
- 6 noise sources: thermal, shot, 1/f, programming, IR drop, drift.
- Cell σ: ~75 nA (~7% on a 1 µA signal).
- Column SNR: improves by √N = 16 → +24 dB over a single cell.
- Effective MVM: ~8 bits in Y1.
- AI tolerance: 5-10% noise → 0.5-2% accuracy loss.
- Mitigation: averaging, noise-injection training, temperature calibration.
- Y10 target: σ 2%, 10 effective bits.
Vision: Make Noise a Feature
- Y1: Noise tolerated.
- Y3: Noise-aware compiler, weight-specific design.
- Y10: Controlled-stochastic memristor (noise level tunable).
- Y100: Noise as regularizer (Bayesian NN, MCMC).
- Y1000: Noise = compute (probabilistic AI).
Further Reading
- Next chapter: 5.11 — Power and Thermal Management
- Previous: 5.9 — Compute Engine and DMA
- Memristor noise: Rodriguez et al., Noise analysis in memristor-based neural networks, IEEE TED 2018.
- Noise-aware training: Joshi et al., Accurate deep neural network inference using computational phase-change memory, Nature Comm. 2020.
- Analog AI reliability: Ambrogio et al., Nature 2023.