🧠 Module 3 · From Biology to Algorithm · Chapter 3.8 · 13 min read

Spike-Timing-Dependent Plasticity (STDP)

The math of 20 milliseconds — Hebb's time-asymmetric successor.

What you'll learn here

  • Draw the asymmetric STDP learning window ($\Delta t > 0$ → LTP, $< 0$ → LTD)
  • Recall the Bi & Poo (1998) experiment and the τ ≈ 20 ms time constant
  • Write STDP as an equation: $\Delta w = A_+ e^{-\Delta t / \tau_+}$ or $-A_- e^{\Delta t / \tau_-}$
  • Explain STDP's biological basis (NMDA + Ca²⁺ cascade asymmetry)
  • Sketch how to implement STDP on a SIDRA memristor via time-coded voltage pulse pairs

Hook: 1998's 20 Milliseconds

In 1998 Bi and Poo, in Synaptic modifications in cultured hippocampal neurons: dependence on spike timing, synaptic strength, and postsynaptic cell type, discovered a precise timing rule in pairs of hippocampal neurons:

  • If the presynaptic spike arrives BEFORE the postsynaptic spike within 20 ms: the synapse strengthens (LTP).
  • If the presynaptic spike arrives AFTER the postsynaptic spike within 20 ms: the synapse weakens (LTD).
  • If the gap exceeds ~50 ms: no change.

This rule is the time-asymmetric version of Hebbian learning: not “cells that fire together wire together” but “the cell that fires first is the cause”.

Intuition: if neuron A is playing a role in firing B in time (A first, then B), then A→B should strengthen. That’s the biological foundation of causal learning.

A quarter-century on, the core idea echoes through modern reinforcement learning, world models, and video-prediction systems: learn from temporal order. SIDRA Y100 target: run STDP natively in memristor hardware.

Intuition: An Asymmetric Learning Window

The STDP learning window looks like this:

       Δw
 +A+  |●
      | ●
      |  ●                 LTP: Δt > 0 (pre fires first)
      |   ●●
      |     ●●●●●
──────┼───────────────→ Δt = t_post − t_pre (ms)
●●    |
  ●●  |
    ● |                    LTD: Δt < 0 (post fires first)
−A−  ●|
   −50       0       +50
  • Right side ($\Delta t = t_{\text{post}} - t_{\text{pre}} > 0$): pre → post order → LTP.
  • Left side ($\Delta t < 0$): post → pre order → LTD.
  • Window scale $\tau \approx 20$ ms: typical of hippocampal and cortical synapses.

Peak values:

  • LTP peak: $\Delta w \approx +A_+ \approx +0.5$ (normalized weight units).
  • LTD peak: $\Delta w \approx -A_- \approx -0.5$ (the LTD amplitude is usually slightly larger than the LTP amplitude).

Why asymmetric? Cause matters. If A always fires after B, A can’t be B’s cause → weaken the link. If A always fires before B, A might be triggering B → strengthen. STDP teaches the brain causality.

Difference from Hebbian: pure Hebbian is order-independent ($\Delta w \propto x \cdot y$; order doesn't matter). STDP is order-dependent. Hebbian captures statistical correlation; STDP captures causal, ordered coupling.

Formalism: The STDP Equation and Its Biology

L1 · Beginner

Standard STDP equation:

$$\Delta w(\Delta t) = \begin{cases} +A_+ \, e^{-\Delta t / \tau_+} & \text{if } \Delta t > 0 \text{ (pre first, post second)} \\ -A_- \, e^{\Delta t / \tau_-} & \text{if } \Delta t < 0 \text{ (post first, pre second)} \\ 0 & \text{if } \Delta t = 0 \end{cases}$$

  • $\Delta t = t_{\text{post}} - t_{\text{pre}}$ (ms)
  • $A_+, A_-$ — peak amplitudes (typically 0.001–0.01, normalized)
  • $\tau_+, \tau_-$ — time constants (~10–50 ms)

Bi & Poo values:

  • $\tau_+ \approx 17$ ms
  • $\tau_- \approx 34$ ms (the LTD side is a bit wider)
  • Often $A_- > A_+$ (LTD dominates → keeps the average weight balanced)
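The pairwise rule above is only a few lines of code. A minimal sketch in Python, using the Bi & Poo-style time constants just listed; the amplitudes are the illustrative 0.1/0.12 scale used later in this chapter, not measured values:

```python
import math

A_PLUS, A_MINUS = 0.1, 0.12       # peak amplitudes (illustrative, normalized)
TAU_PLUS, TAU_MINUS = 17.0, 34.0  # time constants in ms (Bi & Poo-style)

def stdp_dw(dt_ms: float) -> float:
    """Weight change for a single pre/post spike pair.

    dt_ms = t_post - t_pre. Positive dt (pre before post) -> LTP,
    negative dt (post before pre) -> LTD, dt == 0 -> no change.
    """
    if dt_ms > 0:
        return A_PLUS * math.exp(-dt_ms / TAU_PLUS)
    if dt_ms < 0:
        return -A_MINUS * math.exp(dt_ms / TAU_MINUS)
    return 0.0

print(stdp_dw(+10))   # LTP branch: +0.1 * e^(-10/17) ≈ +0.0555
print(stdp_dw(-10))   # LTD branch: -0.12 * e^(-10/34) ≈ -0.0894
```

Note the sign convention: since $\Delta t$ is negative on the LTD branch, $e^{\Delta t / \tau_-}$ already decays as the pair moves further apart in time.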
L2 · Complete

Biological basis — why this window?

Recall the NMDA receptor (3.2): glutamate + post-depolarization together → Ca²⁺ flows. The amount of Ca²⁺ depends on timing:

Pre first, post second (Δt>0\Delta t > 0):

  1. Pre spike → glutamate released.
  2. Glutamate binds NMDA.
  3. A few ms later, post spike → Mg²⁺ block lifts.
  4. NMDA open + glutamate present → large Ca²⁺ influx.
  5. High, sudden Ca²⁺ → CaMKII → AMPA insertion → LTP.

Post first, pre second (Δt<0\Delta t < 0):

  1. Post spike → membrane depolarized, but no glutamate yet.
  2. Pre spike → glutamate released, binds NMDA.
  3. But post depolarization is already over → Mg²⁺ re-blocks.
  4. Little Ca²⁺ flows (only AMPA), and in a prolonged weak flow → calcineurin → AMPA endocytosis → LTD.

Key: the same NMDA + Ca²⁺ cascade produces both LTP and LTD depending on timing. The gate: Δt\Delta t.

Multi-spike STDP (real):

The above describes single-pre / single-post experiments. Real neurons fire in bursts. Multi-spike STDP:

  • Triplet STDP (Pfister & Gerstner 2006): specific behavior on pre-post-pre or post-pre-post patterns.
  • Voltage-based STDP (Clopath et al. 2010): looks at the whole postsynaptic membrane voltage, not just spike times.
  • Calcium-based STDP (Graupner & Brunel 2012): models Ca²⁺ concentration directly.

All of these connect back to BCM (3.3): plasticity operating across multiple timescales.

L3 · Deep

Functional consequences of STDP:

1. Sequence learning: STDP is naturally suited for temporal sequences. Spiking neural networks (SNNs) use it for language, music, motor learning.

2. Sparse representation: inactive post-neurons receive no LTP; STDP naturally biases toward sparse coding.

3. Unstable dynamics: pure STDP is as unstable as pure Hebbian. Fixes: synaptic scaling, BCM-style normalization, homeostatic STDP variants.

4. Combining with reinforcement learning: a third factor (dopamine) gates STDP → R-STDP. STDP is on at reward moments, otherwise off. The brain’s “reinforce good behavior” mechanism.
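The gating in point 4 can be sketched directly. A toy three-factor update, assuming (as the text says) that STDP is simply switched on at reward moments; the amplitudes and function names here are illustrative, not from the source:

```python
import math

A_PLUS, A_MINUS, TAU = 0.1, 0.12, 20.0  # illustrative STDP parameters

def stdp_term(dt_ms: float) -> float:
    """Candidate weight change for one pre/post pair (dt = t_post - t_pre)."""
    if dt_ms > 0:
        return A_PLUS * math.exp(-dt_ms / TAU)
    if dt_ms < 0:
        return -A_MINUS * math.exp(dt_ms / TAU)
    return 0.0

def r_stdp_update(w: float, dt_ms: float, dopamine: float) -> float:
    """Three-factor rule: dopamine = 0 -> no learning; dopamine = 1 -> full STDP."""
    return w + dopamine * stdp_term(dt_ms)

w = 0.5
w = r_stdp_update(w, +10.0, dopamine=0.0)  # no reward: weight unchanged
w = r_stdp_update(w, +10.0, dopamine=1.0)  # reward present: LTP applied
```

The full model (Izhikevich 2007, in Further Reading) adds an eligibility trace so a delayed reward can still credit the right spike pairs; here the reward is assumed to arrive instantly for brevity.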

STDP implementation on SIDRA:

The memristor's natural $\Delta G \propto V_{\text{pre}} \cdot V_{\text{post}}$ behavior can deliver STDP, but not directly. It needs time-coded voltage pulse pairs:

Scheme (Y10 target):

  1. Pre spike → short positive pulse on the presynaptic electrode (e.g. +V/2, 10 ns).
  2. Δt later, post spike → short negative pulse on the postsynaptic electrode (-V/2, 10 ns).
  3. If they overlap: the memristor sees V_pre - V_post = +V → SET (LTP).
  4. If post is first and pre is second: memristor sees -V → RESET (LTD).
  5. The overlap window is shaped by pulse width → the STDP τ is emulated.
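The five steps above amount to a sign-and-overlap check. A toy classifier under stated assumptions: the amplitudes and switching threshold are invented for illustration, and the pulse width is set to tens of ms so the overlap window mirrors the STDP τ (step 5), rather than the raw 10 ns pulse figures:

```python
V_HALF = 0.5     # per-electrode pulse amplitude (arbitrary units, assumed)
WIDTH_MS = 20.0  # pulse width; sets the overlap window that emulates tau
V_TH = 0.8       # assumed device threshold; a lone V/2 stays below it

def pulse_pair_event(t_pre_ms: float, t_post_ms: float) -> str:
    """What the memristor does for one pre/post pulse pair."""
    if t_pre_ms == t_post_ms:
        return "no-op"                 # delta-t = 0: no net change
    if abs(t_post_ms - t_pre_ms) >= WIDTH_MS:
        return "no-op"                 # no overlap: only V/2 seen, below V_TH
    v_across = 2 * V_HALF              # overlapping +V/2 and -V/2 -> full V
    if v_across < V_TH:
        return "no-op"
    # The temporal order of the pulses decides the direction, as in STDP.
    return "SET (LTP)" if t_pre_ms < t_post_ms else "RESET (LTD)"

print(pulse_pair_event(0.0, 5.0))    # pre first, within window -> SET (LTP)
print(pulse_pair_event(10.0, 0.0))   # post first -> RESET (LTD)
print(pulse_pair_event(0.0, 60.0))   # too far apart -> no-op
```

Real devices see a graded, not binary, change: the longer the overlap, the larger $\Delta G$; that grading is what shapes the exponential tail of the emulated window.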

This approach is used by IBM, Intel, and other neuromorphic groups:

  • IBM TrueNorth (2014): spike-based but no STDP; weights loaded post-training.
  • Intel Loihi (2018): spikes + on-chip STDP. CMOS-based.
  • SpiNNaker (Manchester): software-emulated STDP.
  • SIDRA Y100 target: native STDP in memristor hardware — potentially the world’s first large-scale analog STDP chip.

Why STDP for online learning?

Backprop needs: error signal, backward pass, global gradient. Hard in hardware (3.6).

STDP needs: pre + post spike coincidence. Local. Works in a memristor cell without extra circuitry.

Training is “slow” but energy is tiny and hardware fits. Ideal for edge AI.

Experiment: 2 Neurons, 5 Spike Pairs, Synaptic Update

One synapse between two neurons. Initial weight $w_0 = 0.5$. STDP parameters: $A_+ = 0.1$, $A_- = 0.12$, $\tau_+ = \tau_- = 20$ ms.

Observe 5 spike pairs:

| Pair | $t_{\text{pre}}$ (ms) | $t_{\text{post}}$ (ms) | $\Delta t$ (ms) | $\Delta w$ |
|------|------|------|------|------|
| 1 | 0 | 5 | +5 | $+0.1\,e^{-5/20} = +0.078$ |
| 2 | 100 | 90 | −10 | $-0.12\,e^{-10/20} = -0.073$ |
| 3 | 200 | 215 | +15 | $+0.1\,e^{-15/20} = +0.047$ |
| 4 | 300 | 280 | −20 | $-0.12\,e^{-20/20} = -0.044$ |
| 5 | 400 | 405 | +5 | $+0.1\,e^{-5/20} = +0.078$ |

Total change: $0.078 - 0.073 + 0.047 - 0.044 + 0.078 = +0.086$.

New weight: $w = 0.5 + 0.086 = 0.586$.

Interpretation: 3 of 5 pairs were in the LTP direction (pre first), 2 in LTD (post first). Net gain → connection strengthened. If pre-post were always pre-first, the connection would strengthen far faster.
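The table and totals above can be replayed in a short script using the chapter's parameters:

```python
import math

# The five-pair experiment, with the parameters given in the text:
# A+ = 0.1, A- = 0.12, tau = 20 ms, initial weight w0 = 0.5.
A_P, A_M, TAU = 0.1, 0.12, 20.0
pairs = [(0, 5), (100, 90), (200, 215), (300, 280), (400, 405)]  # (t_pre, t_post)

w = 0.5
for t_pre, t_post in pairs:
    dt = t_post - t_pre
    dw = A_P * math.exp(-dt / TAU) if dt > 0 else -A_M * math.exp(dt / TAU)
    w += dw
    print(f"dt = {dt:+4d} ms  ->  dw = {dw:+.3f},  w = {w:.3f}")

print(f"final weight: {w:.3f}")   # ≈ 0.586, matching the +0.086 total above
```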

SIDRA parallel:

  • 5 spike pairs = 5 voltage pulse pairs.
  • Each pair consumes ~1 pJ in the memristor cell (partial STDP update).
  • Total 5 pJ. Vs a full SET (~10 pJ), a partial update is much more efficient.
  • Y100 target: 1 million spike pairs / s / cell → 1 µW per cell. Very low-power edge learning.

Quick Quiz

1/6 · The core STDP rule?

Lab Exercise

STDP-based sequence learning on the SIDRA Y10 prototype.

Scenario: A 4-input (A, B, C, D), single-output LIF neuron. Each input is tied to one SIDRA memristor (4 synapses). We want to train via STDP to recognize the sequence A → B → C (don’t fire for D).

Data:

  • 4 synapses, initial weights $w_A = w_B = w_C = w_D = 0.25$
  • STDP: $A_+ = 0.05$, $A_- = 0.06$, $\tau = 20$ ms
  • Training: 100 examples. 50% are A-B-C sequences (spikes 10 ms apart) followed by a post spike; 50% are random D spikes plus random post spikes.
  • For each synapse, $\Delta w = \sum$ STDP rule over all of its spike pairs
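A minimal simulation of this protocol. Two assumptions beyond the text: weights clip to [0, 1], and each "random D" example is reduced to a single D/post pair with a uniformly random $\Delta t$ in ±50 ms:

```python
import math, random

A_P, A_M, TAU = 0.05, 0.06, 20.0
w = {"A": 0.25, "B": 0.25, "C": 0.25, "D": 0.25}

def dw(dt):
    """Pairwise STDP weight change for dt = t_post - t_pre (ms)."""
    return A_P * math.exp(-dt / TAU) if dt > 0 else -A_M * math.exp(dt / TAU)

random.seed(0)
for trial in range(100):
    if trial % 2 == 0:                       # A-B-C example, post spike at 30 ms
        for name, t_pre in [("A", 0.0), ("B", 10.0), ("C", 20.0)]:
            w[name] = min(1.0, w[name] + dw(30.0 - t_pre))
    else:                                    # random D spike + random post spike
        w["D"] = min(1.0, max(0.0, w["D"] + dw(random.uniform(-50.0, 50.0))))

print(w)   # B and C saturate at 1.0, A ends lower, D stays well below them
```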

Questions:

(a) First training example (A-B-C → post, spikes at 0/10/20/30 ms): what are $\Delta t$ and $\Delta w$ for each synapse?

(b) Expected weight distribution after 100 examples? Which synapse gains the most?

(c) What does the trained neuron do on an A-B-C sequence? On a lone D spike?

(d) Compared to backprop: when is STDP the better choice?

(e) How many parallel sequence-recognition nodes can one SIDRA Y10 crossbar (256×256) host?

Solutions

(a) A: $t_{\text{pre}} = 0$, $t_{\text{post}} = 30$, $\Delta t = +30$ ms → $\Delta w = +0.05\,e^{-30/20} = +0.0112$. B: $\Delta t = +20$ → $+0.0184$. C: $\Delta t = +10$ → $+0.0303$. D: random → ≈ 0. A gains least (farthest in time from the post spike), C gains most (closest).
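Part (a)'s numbers check out in a couple of lines per synapse:

```python
import math

# Part (a): post spike at t = 30 ms; A, B, C fire at 0, 10, 20 ms.
A_P, TAU = 0.05, 20.0
t_post = 30.0

for name, t_pre in [("A", 0.0), ("B", 10.0), ("C", 20.0)]:
    dt = t_post - t_pre                    # always positive -> LTP branch
    dw = A_P * math.exp(-dt / TAU)
    print(f"{name}: dt = +{dt:.0f} ms, dw = +{dw:.4f}")
# A: +0.0112, B: +0.0184, C: +0.0303; C (closest in time) gains the most.
```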

(b) 50 of the 100 examples are A-B-C, so C's weight rises fastest: naively $0.25 + 50 \times 0.03 \approx 1.75$, but in practice it saturates at 1.0. B: $0.25 + 50 \times 0.018 \approx 1.2$, also saturating. A: $0.25 + 50 \times 0.011 \approx 0.8$. D: random pairings roughly cancel (with a slight LTD bias since $A_- > A_+$) → stays near 0.25. Ordering: C ≥ B > A >> D (C saturates first).

(c) On A-B-C: each synapse gets a spike, the membrane depolarizes cumulatively → threshold crossed → post spike. Recognition! On D: one small EPSP, threshold not crossed → no post spike. The neuron has become an “A-B-C detector”.

(d) STDP advantage: no labels during training (unsupervised); just spike coincidences. Backprop needs target labels. STDP is also online (updates on every spike pair), backprop is batch. STDP fits edge AI better.

(e) One 256×256 crossbar drives 256 output columns, i.e. 256 sequence-recognition neurons (each reading all 256 inputs). Y1's 6400 crossbars → ~1.64M sequence-recognition neurons. Each can learn a different sequence → massive unsupervised feature learning. Y10 has 24× the capacity → ~39M.

Cheat Sheet

  • STDP rule: pre first, post second → LTP; reverse → LTD. Asymmetric timing window, $\tau \approx 20$ ms.
  • Bi & Poo 1998: precise observation in hippocampal neurons.
  • Equation: $\Delta w = +A_+ e^{-\Delta t / \tau_+}$ for $\Delta t > 0$; $\Delta w = -A_- e^{\Delta t / \tau_-}$ for $\Delta t < 0$.
  • Biology: NMDA + Ca²⁺ cascade asymmetry.
  • Upside: time-dependent → learns causal relations (not just correlation).
  • Reinforcement combo: R-STDP (dopamine modulation).
  • SIDRA: time-coded pre/post voltage pulse pairs → natural STDP in memristor hardware. Y100 target.

Vision: STDP-Native Hardware and SIDRA's Brain Claim

STDP is the atom of brain-compatible learning. SIDRA’s ultimate claim is to validate that atom in hardware:

  • Y1 (today): no STDP; weights trained externally on GPU, fixed on chip. Inference-focused.
  • Y3 (2027): software-emulated STDP (CMOS control circuit sends STDP-shaped pulses to the memristor). Prototype scale.
  • Y10 (2029): hardware-native STDP — pre/post pulse pairs update the memristor directly with the STDP rule. Multi-spike variants (triplet, voltage-based).
  • Y100 (2031+): STDP + R-STDP + sparse spike coding + multi-timescale plasticity all at once. Brain-compatible online learning; a GPT-class model trains at the edge.
  • Y1000 (long horizon): bio-compatible organic STDP device + brain implant. Neuralink’s closed-loop AI.

Strategic signal for Türkiye: no commercial STDP-native analog chip exists today (Intel Loihi and IBM TrueNorth implement spiking digitally in CMOS). Memristor-based, true analog STDP hardware is an open category. If the SIDRA Y10 prototype wins that category, it could be first in the world. A rare category where Türkiye could plausibly lead in AI.

Unexpected future: the continuously-learning home robot. A robot arrives, meets a child, cat, kitchen — STDP lets it learn from every interaction, offline, no GPU. SIDRA Y100 + STDP could be the first commercial system for this. 2032-2035 horizon; Türkiye has the chance to patent the architecture.

Module 3 wrap-up: we went from biology to algorithm, synapse to memristor, Hebb to STDP. Module 4 (Math Arsenal) covers the algebra, probability, and optimization tools beneath this chain. Module 5 (Chip Hardware) turns it into silicon circuits in the SIDRA.

Further Reading

  • Next module: 🚧 4.1 · Vector, Matrix, MVM — Coming soon
  • Previous: 3.7 — Memristor ↔ Synapse Mapping
  • STDP discovery: Bi & Poo, Synaptic modifications in cultured hippocampal neurons: dependence on spike timing, synaptic strength, and postsynaptic cell type, J. Neurosci. 1998.
  • Markram priority: Markram, Lübke, Frotscher, Sakmann, Regulation of synaptic efficacy by coincidence of postsynaptic APs and EPSPs, Science 1997.
  • Triplet STDP: Pfister & Gerstner, Triplets of spikes in a model of spike timing-dependent plasticity, J. Neurosci. 2006.
  • Voltage-based STDP: Clopath et al., Connectivity reflects coding: a model of voltage-based STDP…, Nature Neurosci. 2010.
  • R-STDP (dopamine): Izhikevich, Solving the distal reward problem through linkage of STDP and dopamine signaling, Cereb. Cortex 2007.
  • STDP in memristors: Yu et al., An electronic synapse device based on metal oxide resistive switching memory for neuromorphic computation, IEEE TED 2011.
  • Loihi neuromorphic chip: Davies et al., Loihi: A neuromorphic manycore processor with on-chip learning, IEEE Micro 2018.