📐 Module 4 · The Math Arsenal · Chapter 4.2 · 12 min read

Ohm + Kirchhoff = Analog MVM

Two laws produce one matrix multiply — SIDRA's physics-algebra bridge.

What you'll learn here

  • Show that Ohm's law I = G·V is a multiplication operation at the cell level
  • Explain that Kirchhoff's current law (KCL) is an addition operation
  • Derive how the two laws together produce an N×N MVM in a single crossbar
  • Run an MVM by hand on a 3×3 SIDRA crossbar
  • List the main error sources in analog MVM (programming, noise, drift, IR drop)

Hook: 1827 + 1845 = 2026 SIDRA

Two foundational electrical laws, almost 200 years old:

  • Ohm’s law (1827): V = I·R, or equivalently I = G·V.
  • Kirchhoff’s current law, KCL (1845): the current entering a node equals the current leaving it.

Two sentences. Both familiar from high-school physics. But put them together and they physically produce the densest mathematical operation in modern AI (matrix-vector multiply). No computation — automatic, by physical law.

That’s SIDRA’s core trick. Apply input voltages to a 256×256 crossbar → the output currents are immediately the 256-dimensional MVM result. No clock cycle. No CPU/GPU instruction. Just electrons obeying Ohm + KCL.

This chapter places the two laws side by side, overlays them, and builds crossbar arithmetic from scratch. The MVM we saw in 4.1 gets cast into silicon here.

Intuition: One Cell Multiplies, One Column Sums

A single memristor cell:

A memristor with conductance G. Apply a voltage V. Ohm says:

I = G \cdot V

This is a multiplication: G and V are two numbers; I is their product.

  • G = 100 µS, V = 0.1 V → I = 10 µA.
  • G is the “weight” (programmed); V is the “data” (live).
  • The multiplication is done by physics — no transistor switched.

256 memristors in one column:

256 rows, one column. Voltage V_i is applied to row i; each cell sources current G_ij·V_i. All of these currents flow into the same column wire. Kirchhoff says:

I_{\text{col}} = \sum_{i=1}^{256} G_{ij} V_i

This is a dot product: the same vector operation we defined in 4.1.

256 columns in parallel:

256 columns, all reading the same 256 inputs. Each column produces a different dot product → a 256-dimensional output vector. That is the MVM: a 256×256 matrix times a 256-vector.

One step: 65,536 multiplies + 256 × 255 = 65,280 adds. All parallel, ~10 ns. A CPU takes 130,000 ns for the same job.

Bottom line: the crossbar = an analog MVM engine. Each cell is a synapse (3.7), each column a neuron’s synaptic integration. Physical laws do the math for free.
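The two steps above (Ohm for the multiply, KCL for the sum) can be sketched in a few lines of Python. Values and function names here are illustrative, not SIDRA code:

```python
def cell_current(g_siemens, v_volts):
    """Ohm's law at one cell: the current is the product G * V."""
    return g_siemens * v_volts

def column_current(g_column, v_inputs):
    """KCL on one column wire: the cell currents simply add up."""
    return sum(cell_current(g, v) for g, v in zip(g_column, v_inputs))

# One cell, the example from the text: 100 uS * 0.1 V = 10 uA.
print(f"{cell_current(100e-6, 0.1) * 1e6:.1f} uA")  # -> 10.0 uA
```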

Formalism: From Two Laws to the Crossbar Equation

L1 · Basics

Ohm’s law (compact):

I = G \cdot V
  • G = conductance (siemens, S = 1/Ω). Memristor range: 0.1-100 µS typical.
  • V = voltage (V).
  • I = current (A).

Kirchhoff’s current law (KCL):

The sum of currents into a node equals the sum of currents leaving:

\sum I_{\text{in}} = \sum I_{\text{out}}

Equivalent: the algebraic sum of all currents at a node is zero.

Crossbar geometry:

       Col 1   Col 2   ...   Col M
        |       |             |
Row 1 -[G11]---[G12]----...--[G1M]---
        |       |             |
Row 2 -[G21]---[G22]----...--[G2M]---
        |       |             |
  ...
        |       |             |
Row N -[GN1]---[GN2]----...--[GNM]---
        |       |             |
        I1      I2     ...    IM (column currents, measured)
  • Horizontal wires (rows): voltage V_i applied (i = 1..N).
  • Vertical wires (columns): current I_j collected (j = 1..M).
  • Memristor G_ij sits at each intersection.

Total current in column j (Ohm + KCL):

I_j = \sum_{i=1}^{N} G_{ij} \cdot V_i

Vector form:

\mathbf{I} = \mathbf{G}^\top \mathbf{V}

This is the MVM. The matrix \mathbf{G}^\top \in \mathbb{R}^{M \times N} holds the weights; the vector \mathbf{V} \in \mathbb{R}^N is the input.
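In code, the crossbar equation is a single transposed matrix product. A NumPy sketch with random illustrative conductances and inputs (sizes from the text; the distributions are our assumption):

```python
import numpy as np

rng = np.random.default_rng(0)
N, M = 256, 256
G = rng.uniform(1e-6, 100e-6, size=(N, M))  # conductances G_ij: 1-100 uS
V = rng.uniform(0.0, 0.5, size=N)           # row voltages: 0-0.5 V

# I_j = sum_i G_ij * V_i for every column j, all at once:
I = G.T @ V

print(I.shape)  # -> (256,)
```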

L2 · Full

Practical SIDRA Y1 sizes:

  • N = M = 256 (256×256 crossbar).
  • G_ij → 256 discrete levels (8 bit). Range: G_min = 1 µS, G_max = 100 µS.
  • V_i → from a DAC. 8-bit input typical. Range: 0 V to 0.5 V (won’t trigger a memristor SET).
  • I_j → into an ADC. 8- or 12-bit quantization.

MVM energy budget:

  • 256 rows × 0.5 V (max) = 128 V·row total (if all active).
  • Each cell typically 50 µS × 0.25 V = 12.5 µA → 256² = 65,536 cells × 12.5 µA ≈ 800 mA total (worst case, every cell conducting). With sparsity, ~10× lower in practice.
  • One MVM lasts 10 ns → energy ≈ 80 mA × 0.25 V × 10 ns ≈ 200 pJ with sparsity (~2 nJ absolute worst case). Typical ~26 pJ.
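These budget numbers can be reproduced with a short script (0.25 V is taken as the average operating voltage, as above):

```python
# Worst case: every one of the 65,536 cells conducts 50 uS * 0.25 V.
n_cells = 256 * 256
i_cell = 50e-6 * 0.25              # 12.5 uA per cell
i_total = n_cells * i_cell         # total current, all cells active
i_sparse = i_total / 10            # ~10x lower with realistic sparsity
e_mvm = i_sparse * 0.25 * 10e-9    # E = I * V * t over one 10 ns MVM

print(f"{i_total * 1e3:.0f} mA")   # -> 819 mA (the ~800 mA worst case)
print(f"{e_mvm * 1e12:.0f} pJ")    # -> 205 pJ (the ~200 pJ figure)
```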

Bit accuracy:

Crossbar: 8-bit programming × 8-bit input = theoretical 16-bit product. But:

  • ADC quantization: 8-bit ADC → 8-bit output (0-255).
  • Noise (thermal, shot, drift): ~4-6 bit effective SNR.

Practical effective accuracy: ~6 bits. Plenty for modern AI inference (INT8 quantization is standard). Workloads needing FP32 don’t suit the crossbar.

Reading the output current:

The ADC turns the raw current into an integer: I_j = k \cdot y_j, with k a calibration constant (mA per integer step). With G_max = 100 µS, V_max = 0.5 V, and 256 rows: I_max = 256 · 100 µS · 0.5 V = 12.8 mA. The ADC maps this range to 0-255.
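A minimal sketch of this read-out path, using the parameters above (the linear mapping and the clipping behaviour are our assumptions):

```python
G_MAX = 100e-6   # S, maximum cell conductance
V_MAX = 0.5      # V, maximum row voltage
N_ROWS = 256

I_MAX = N_ROWS * G_MAX * V_MAX  # full-scale column current: 12.8 mA

def adc_8bit(i_amps):
    """Map a column current linearly onto the 0-255 code range, clipped."""
    code = round(i_amps / I_MAX * 255)
    return max(0, min(255, code))

print(f"{I_MAX * 1e3:.1f} mA")  # -> 12.8 mA
print(adc_8bit(I_MAX))          # -> 255
```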

L3 · Deep

Error sources (analog MVM weaknesses):

  1. Programming error: a memristor can’t be programmed exactly to the target G; ~±5% variation is typical. ISPP (Incremental Step Pulse Programming) brings this down to ±1% (chapter 6.4).

  2. Thermal (Johnson) noise: \sigma_I = \sqrt{4 k T G \Delta f}. With 256 cells in parallel, the total noise scales as \sqrt{N} \cdot \sigma. SNR is usually 30-40 dB.

  3. Shot noise: \sigma_I = \sqrt{2 q I \Delta f}. Dominant at low currents (HRS cells).

  4. Drift / 1/f noise: memristor conductance changes slowly over time. Retention is a concern on timescales from 10 ms to 1 year (chapter 5.10).

  5. IR drop: voltage drops along the long copper wires, so V_i doesn’t arrive at the far end of the row at full amplitude: 0.5 V at the start, 0.45 V at the end. Output currents skew (chapter 5.12).

  6. Sneak-path currents: without a selector, unwanted current bleeds through half-selected cells. 1T1R and 1S1R structures fix this (chapters 5.2-5.3).

  7. Temperature dependence: G(T) follows an Arrhenius law. Going from 25 °C to 85 °C shifts conductance by 20-50%. Temperature calibration is mandatory.

Total effective SNR: ~30 dB → ~5 effective bits (a signal-to-noise ratio of about 32). Enough for classification; regression needs additional techniques (averaging, error correction).
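Error source 1 can be simulated directly: perturb every weight by ±5% relative Gaussian noise and compare the noisy MVM to the ideal one. A NumPy sketch with illustrative sizes and distributions; since the other six sources are not modeled, the SNR comes out well above the ~30 dB total quoted in the text:

```python
import numpy as np

rng = np.random.default_rng(42)
N = 256
G = rng.uniform(1e-6, 100e-6, size=(N, N))  # ideal programmed weights
V = rng.uniform(0.0, 0.5, size=N)

I_ideal = G.T @ V
G_real = G * (1 + 0.05 * rng.standard_normal(G.shape))  # ±5% programming error
I_real = G_real.T @ V

# SNR of the analog result, and the effective bits it supports:
snr_db = 10 * np.log10((I_ideal**2).mean() / ((I_real - I_ideal)**2).mean())
enob = (snr_db - 1.76) / 6.02
print(f"SNR ~{snr_db:.0f} dB, ~{enob:.1f} effective bits")
```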

Countermeasures:

  • ISPP programming (5.5)
  • Temperature compensation (5.10)
  • Multi-read + averaging (3-10 samples)
  • Online calibration (a reference MVM every N inferences)
  • ECC (Error Correction Code) — new techniques for analog (5.8)

Crossbar’s mathematical perfection vs reality:

Ideal: I = G^\top V, infinite precision, zero noise.

Real: I = G^\top V + \epsilon, where \epsilon is a ~5% error term. Acceptable for AI inference; harder for training (gradient drift).

Experiment: Manual MVM on a 3×3 Crossbar

3-row × 3-column crossbar. Conductances (μS):

\mathbf{G} = \begin{bmatrix} 50 & 20 & 30 \\ 10 & 80 & 25 \\ 40 & 15 & 60 \end{bmatrix}

Input voltages (V):

\mathbf{V} = \begin{bmatrix} 0.2 \\ 0.4 \\ 0.1 \end{bmatrix}

Column 1 current:

I_1 = G_{11} V_1 + G_{21} V_2 + G_{31} V_3 = 50 \cdot 0.2 + 10 \cdot 0.4 + 40 \cdot 0.1 = 10 + 4 + 4 = 18 µA

Column 2:

I_2 = 20 \cdot 0.2 + 80 \cdot 0.4 + 15 \cdot 0.1 = 4 + 32 + 1.5 = 37.5 µA

Column 3:

I_3 = 30 \cdot 0.2 + 25 \cdot 0.4 + 60 \cdot 0.1 = 6 + 10 + 6 = 22 µA

Output vector:

\mathbf{I} = \begin{bmatrix} 18 \\ 37.5 \\ 22 \end{bmatrix} \text{µA}

After the ADC (calibration k = 0.5 µA per integer step):

\mathbf{y} = \begin{bmatrix} 36 \\ 75 \\ 44 \end{bmatrix}

This is the output of a 3-neuron MLP layer (pre-activation). Three neurons, each with three synapses, in parallel.
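The hand calculation checks out numerically; a NumPy version of the same 3×3 example:

```python
import numpy as np

G = np.array([[50, 20, 30],
              [10, 80, 25],
              [40, 15, 60]], dtype=float)  # conductances in uS
V = np.array([0.2, 0.4, 0.1])              # row voltages in V

I = G.T @ V            # column currents in uA
y = np.round(I / 0.5)  # ADC, k = 0.5 uA per integer step

print(I)  # column currents: 18, 37.5, 22 uA
print(y)  # ADC codes: 36, 75, 44
```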

Time: 10 ns. Energy: (18 + 37.5 + 22) \cdot 10^{-6} \cdot 0.25 \cdot 10 \cdot 10^{-9} \approx 200 fJ. 9 MACs at ~200 fJ → ~22 fJ/MAC ≈ 45 TOPS/W (for this small example; a 256×256 array lands lower because ADC overhead grows).

Comparison: the same job in a 32-bit digital MAC circuit costs ~5-10 pJ per MAC. The SIDRA crossbar is 200-500× more efficient.

Quick Quiz

1/6 · What operation does Ohm's law perform on a single memristor cell?

Lab Exercise

Energy efficiency analysis for a SIDRA Y1 256×256 crossbar.

Data:

  • Crossbar: 256×256 = 65,536 memristors.
  • Typical MVM energy: 26 pJ.
  • ADC energy: 1 pJ × 256 columns = 256 pJ (huge overhead!).
  • DAC energy: 0.5 pJ × 256 rows = 128 pJ.
  • Peripheral control: 50 pJ.
  • Total MVM energy: ~460 pJ (crossbar 6%, ADC 56%, DAC 28%, control 10%).
  • MVM time: 10 ns (crossbar settling) + 5 ns (ADC convert) = 15 ns.

Questions:

(a) Effective TOPS/W per MVM? (b) ADC energy dwarfs the crossbar. How to reduce it? (c) Y10 target: 6-bit ADC + on-chip averaging instead of 8-bit. New ADC energy? Effective efficiency? (d) GPT-2 small inference (~250 MFLOPS) on Y1: how long? Total energy? (e) Same GPT-2 inference on H100 takes microseconds at 700 W. In which metric is SIDRA Y1 ahead?

Solutions

(a) 65,536 MAC / 460 pJ ≈ 1.42 × 10¹⁴ MAC/J ≈ 142 TOPS/W per MVM, even with the ADC overhead included. The system-level SIDRA Y1 figure (~10 TOPS/W) is lower once data movement, control, and idle power outside the MVM are counted.

(b) ADC reduction strategies: (1) lower resolution (6-bit is enough), (2) read only active columns (sparsity), (3) sample-and-hold for time-multiplexing, (4) time-domain ADC (chapter 5.6 TDC).

(c) 6-bit ADC: ~0.25 pJ × 256 = 64 pJ (a 4× drop). On-chip averaging (4 samples) adds ~80 pJ. Total MVM: 26 + 64 + 128 + 50 + 80 ≈ 350 pJ. Efficiency: 65,536 MAC / 350 pJ ≈ 187 TOPS/W. The Y10 target design is more aggressive.

(d) GPT-2 inference: ~250 MFLOPs per token. SIDRA Y1 at 30 TOPS → 250 MFLOPs / 30 TOPS ≈ 8.3 µs/token. Energy: 8.3 µs × 3 W ≈ 25 µJ.

(e) H100 GPT-2 inference: ~1 µs/token at 700 W → ~700 µJ/token. SIDRA: ~8.3 µs/token, ~25 µJ. Latency: H100 is ~8× faster. Energy: SIDRA is ~28× more efficient (700 µJ / 25 µJ ≈ 28). I.e., SIDRA wins at the edge with small batches; H100 wins in the data center on throughput.

Conclusion: the right category for each chip is different. SIDRA = edge inference, H100 = batch training. They run side-by-side, not as substitutes.

Cheat Sheet

  • Ohm = multiply: I = G·V. Single memristor cell.
  • KCL = add: column current = sum of cell currents.
  • Together = MVM: \mathbf{I} = \mathbf{G}^\top \mathbf{V}. A 256×256 crossbar = 65K MACs in parallel.
  • Time: ~10 ns (Ohmic settling).
  • Energy: ~26 pJ crossbar + ADC/DAC overhead.
  • Effective accuracy: ~5-6 bits (8-bit program, noise losses).
  • Error sources: programming (~5%), thermal/shot noise, drift, IR drop, sneak-path, temperature.
  • Win: ~100-500× better energy than digital MAC.

Vision: From Pure Analog to Hybrid Architecture

Pure analog MVM (Ohm + KCL only) is limited: precision, scale, stability. The future SIDRA architecture is hybrid — analog crossbar + digital periphery + algorithmic compensation:

  • Y1 (today): Pure analog MVM, digital ADC/DAC. ~10 TOPS/W, 8-bit effective.
  • Y3 (2027): TDC (Time-to-Digital Converter) read — chapter 5.6. ADC overhead drops 50%. Effective 9-bit.
  • Y10 (2029): Multi-bit cell (16 → 256 → 1024 levels). Temperature-aware calibration. Effective 10-12 bit.
  • Y100 (2031+): Photonic MVM + electronic combination. Hybrid Ohmic + optical interferometric. ~1000 TOPS/W. Full analog FP16 equivalent.
  • Y1000 (long horizon): Superconducting (4 K) crossbar. Zero resistance → no IR drop, very low noise. Effective 16+ bit. Quantum-AI hybrid.

Strategic value for Türkiye: analog MVM design demands semiconductor know-how: circuit design, materials, instrumentation. Türkiye has a naturally suited ASIC ecosystem (TÜBİTAK BİLGEM, universities, ASELSAN). The SIDRA workshop is the first concrete platform to consolidate this know-how.

Unexpected future: a crossbar-based computer. Today the crossbar is an AI accelerator. By the 2030s, crossbar = main CPU + memory + AI on a single die. The von Neumann split (CPU ↔ RAM) goes away. SIDRA could be one of the first major manufacturers of that paradigm shift.

Further Reading