💻 Module 6 · Software Stack · Chapter 6.8 · 9 min read

Digital Twin / Simulator

Develop for SIDRA without SIDRA — hardware simulation.

Prerequisites

6.7 — Compiler: Model → Analog Mapping

What you'll learn here

Explain the digital-twin concept and why it's necessary
Distinguish the three SIDRA simulator levels (fast, accurate, cycle)
Run hardware models (noise, IR drop, thermal) in the simulator
Summarize the QAT + simulator feedback loop
Identify the simulator's role in pre-production verification

Hook: Software Ready Before the Chip

The Y1 physical prototype is due in 2026, but software development has been underway since 2023. How?

Digital twin: a software model of SIDRA hardware. Code + model can be tested without real silicon.

Levels:

Algorithm correctness test (fast sim).
Hardware behavior validation (accurate sim).
Cycle-accurate timing (cycle sim).

This chapter details each level and practical use.

Intuition: 3 Simulator Levels

Fast Simulator (Python/C)
    ~100× slower than real-time
    "Functional correctness" — what the model does
    
Accurate Simulator (C++)
    ~10000× slower
    "Hardware behavior" — noise, IR drop, quantization
    
Cycle-accurate (Verilog/SystemC RTL)
    ~10^6× slower
    "Full silicon behavior" — timing, power

Each level answers different questions.
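To get a feel for what these slowdown factors mean in practice, the sketch below converts hardware time into estimated simulation wall-clock time. The slowdown values are the approximate ones quoted above; the workload (1000 inferences at 5 ms each) and the function name `wall_clock_seconds` are hypothetical, chosen only for illustration.

```python
# Approximate slowdown factors quoted above (simulation time / hardware time).
SLOWDOWN = {
    "fast": 1e2,
    "accurate": 1e4,
    "cycle": 1e6,
}

def wall_clock_seconds(hardware_seconds: float, level: str) -> float:
    """Estimated simulation time for work that takes `hardware_seconds` on silicon."""
    return hardware_seconds * SLOWDOWN[level]

# Hypothetical workload: 1000 inferences at 5 ms each = 5 s of hardware time.
hardware_time = 1000 * 5e-3
for level in SLOWDOWN:
    print(f"{level:>8}: {wall_clock_seconds(hardware_time, level):,.0f} s")
```

Under these assumptions the same benchmark takes roughly 8 minutes on the fast simulator, about 14 hours on the accurate one, and close to two months at cycle accuracy, which is why each level is reserved for the questions only it can answer.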

Formalism: Simulator Implementation

L1 · Beginner

Fast simulator:

import numpy as np

class FastSimulator:
    def __init__(self, compiled_model):
        self.weights = compiled_model.weights  # INT8
        self.layers = compiled_model.layers
    
    def infer(self, x):
        for layer in self.layers:
            if layer.type == "MVM":
                w = self.weights[layer.idx]
                # Plain numpy MVM; widen INT8 so the accumulation cannot overflow
                x = np.dot(w.astype(np.int32), x)
            elif layer.type == "ReLU":
                x = np.maximum(0, x)
            # ...
        return x

Good for fast prototyping: no noise and no IR drop are modeled, so this level checks functional correctness only.
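As a minimal, self-contained illustration of what "functional" means here, the sketch below runs an ideal MVM + ReLU pass over a tiny hand-written model. The layer list, the 3×4 weight matrix, and the `fast_infer` helper are hypothetical stand-ins for a real compiled model, not part of the SIDRA toolchain.

```python
from types import SimpleNamespace
import numpy as np

# Hypothetical stand-in for a compiled model: one 3x4 INT8 MVM layer, then ReLU.
layers = [SimpleNamespace(type="MVM", idx=0), SimpleNamespace(type="ReLU", idx=None)]
weights = [np.array([[1, -2, 0, 3],
                     [0, 1, 1, -1],
                     [-4, 0, 2, 1]], dtype=np.int8)]

def fast_infer(x):
    """Ideal inference: exact integer arithmetic, no analog effects."""
    for layer in layers:
        if layer.type == "MVM":
            # Widen to int32 so INT8 products do not overflow.
            x = weights[layer.idx].astype(np.int32) @ x
        elif layer.type == "ReLU":
            x = np.maximum(0, x)
    return x

x = np.array([1, 2, 3, 4], dtype=np.int32)
y = fast_infer(x)
print(y)  # [9 1 6]
```

The result is deterministic: the same input always yields the same output, so any mismatch against the reference model points to a compiler or mapping bug rather than to hardware effects.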

Accurate simulator:

class AccurateSimulator(FastSimulator):
    def infer(self, x):
        for layer in self.layers:
            if layer.type == "MVM":
                w = self.weights[layer.idx]
                # Device noise: Gaussian, 5% of the weight standard deviation
                noise = np.random.normal(0, 0.05 * w.std(), w.shape)
                # IR drop, applied as a perturbation of the effective weights
                ir_drop = self.estimate_ir_drop(layer)
                x = np.dot(w + noise + ir_drop, x)
                # Quantization error on the MVM output
                x = self.quantize(x, bits=8)
            # ...
        return x

This gives a realistic accuracy estimate for model validation. Because the noise is random, run it hundreds of times (Monte Carlo).
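The listing above leaves `quantize` and `estimate_ir_drop` undefined. The sketch below fills in one plausible version of the noise and quantization steps so their effect on a single MVM can be measured. The symmetric uniform quantizer and the `noisy_mvm` helper are assumptions made for illustration; IR drop is left out because its model is not specified in this chapter.

```python
import numpy as np

def quantize(x, bits=8):
    """Symmetric uniform quantization to `bits` bits, returned in the original scale."""
    qmax = 2 ** (bits - 1) - 1
    scale = np.max(np.abs(x)) / qmax
    if scale == 0:
        return x
    return np.round(x / scale) * scale

def noisy_mvm(w, x, rng, rel_noise=0.05, bits=8):
    """MVM with Gaussian weight noise (std = rel_noise * w.std()) and output quantization."""
    noise = rng.normal(0.0, rel_noise * w.std(), w.shape)
    return quantize((w + noise) @ x, bits=bits)

rng = np.random.default_rng(0)
w = rng.standard_normal((16, 32))
x = rng.standard_normal(32)

ideal = w @ x
noisy = noisy_mvm(w, x, rng)
rel_err = np.linalg.norm(noisy - ideal) / np.linalg.norm(ideal)
print(f"relative output error: {rel_err:.3f}")
```

Passing an explicit `rng` instead of calling the global `np.random` keeps each run reproducible, which matters once the same model is evaluated across many seeds.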

L2 · Full

Cycle-accurate simulator:

This level is written at RTL (register-transfer level) in Verilog/SystemC and simulates every clock cycle:

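always @(posedge clk) begin
    if (mvm_start) begin
        // All crossbar columns accumulate in parallel (one result per column)
        for (int c = 0; c < 256; c++) begin
            logic signed [31:0] acc;  // column accumulator; width is illustrative
            acc = 0;
            // ROWS: number of crossbar rows, assumed to be a module parameter
            for (int r = 0; r < ROWS; r++)
                acc += weights[r][c] * voltage[r];
            current[c] <= acc;
        end
        // ...
    end
end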

Very slow (at the ~10^6× slowdown quoted above, 1 s of simulation covers about 1 µs of hardware time), but it verifies the RTL cycle by cycle.

Verilator is the open-source alternative (commercial: Synopsys VCS, Cadence Xcelium).

Monte Carlo analysis:

Noise is random → run the benchmark 100 times with different seeds and observe the accuracy distribution:

import numpy as np

accuracies = []
for seed in range(100):
    sim = AccurateSimulator(model, seed=seed)  # a distinct integer seed per run
    acc = sim.benchmark(test_data)
    accuracies.append(acc)

print(f"Mean: {np.mean(accuracies):.2%}, Std: {np.std(accuracies):.2%}")
# Output: Mean: 97.50%, Std: 0.30%
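The loop above depends on the SIDRA simulator classes. For readers without them, here is a self-contained sketch of the same Monte Carlo idea on a toy linear classifier: each seed draws one noise realization on the weights, and the spread of accuracies is summarized afterwards. The synthetic data, the 5% relative noise level, and the helper names are assumptions for illustration only.

```python
import numpy as np

def make_data(rng, n=500, dim=8):
    """Synthetic two-class data labelled by a fixed random hyperplane."""
    w_true = rng.standard_normal(dim)
    x = rng.standard_normal((n, dim))
    y = (x @ w_true > 0).astype(int)
    return w_true, x, y

def noisy_accuracy(w, x, y, seed, rel_noise=0.05):
    """Accuracy of a linear classifier whose weights get one random noise draw."""
    rng = np.random.default_rng(seed)
    w_noisy = w + rng.normal(0.0, rel_noise * w.std(), w.shape)
    return np.mean((x @ w_noisy > 0).astype(int) == y)

w, x, y = make_data(np.random.default_rng(42))
accs = np.array([noisy_accuracy(w, x, y, seed) for seed in range(100)])
print(f"mean={accs.mean():.2%}  std={accs.std(ddof=1):.2%}  min={accs.min():.2%}")
```

Reporting the minimum alongside the mean and standard deviation is a cheap way to spot rare bad draws that an average would hide.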

Worst-case analysis:

Simulate the scenarios in which the model is most sensitive (high temperature, maximum IR drop, aged device). SIDRA gating criterion: worst-case accuracy of at least 95%.
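A gating check of this kind reduces to taking the minimum over the stress corners and comparing it with the threshold. The sketch below assumes the simulator has already produced one accuracy per corner; the corner names follow the text, while the accuracy values and function names are hypothetical.

```python
GATE = 0.95  # worst-case accuracy threshold from the text

def worst_case(corner_accuracy: dict[str, float]) -> tuple[str, float]:
    """Return the (corner name, accuracy) pair with the lowest accuracy."""
    name = min(corner_accuracy, key=corner_accuracy.get)
    return name, corner_accuracy[name]

def passes_gate(corner_accuracy: dict[str, float], gate: float = GATE) -> bool:
    """True only if every corner, including the worst one, meets the gate."""
    return worst_case(corner_accuracy)[1] >= gate

# Hypothetical simulator results, one per stress corner.
results = {
    "typical": 0.972,
    "high_temperature": 0.961,
    "max_ir_drop": 0.955,
    "aged_device": 0.948,
}
name, acc = worst_case(results)
print(f"worst corner: {name} ({acc:.1%}), gate passed: {passes_gate(results)}")
# worst corner: aged_device (94.8%), gate passed: False
```

Here the typical-case accuracy looks comfortable, yet the model fails the gate because a single corner falls below 95%.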

FPGA prototyping:

If the cycle-accurate sim is too slow, use an FPGA. The SIDRA RTL is loaded onto a Xilinx/Intel FPGA, where it runs about 10× slower than real time while still reproducing the hardware behavior.

The SIDRA Y1 FPGA prototype was completed in 2025. The software stack was tested on top.

L3 · Deep

Digital twin ecosystem:

PyTorch Model
    ↓
Compiler (6.7)
    ↓ compile
Compiled Binary
    ↓ deploy
[Fast Simulator]  OR  [Accurate Simulator]  OR  [FPGA]  OR  [Real Y1]

A developer uses the same binary across targets. As fidelity grows:

Fast: prototype.
Accurate: pre-release.
FPGA: production validation.
Real Y1: deployment.

QAT + simulator loop:

acc = 0.0
while acc < target:
    model = train(model)
    model = quantize(model, method="qat")
    compiled = compile(model)
    acc = accurate_simulator.benchmark(compiled)
    if acc < target:
        # Noise-injection training before the next iteration
        model = train_with_noise(model, noise_std=0.05)
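The `train_with_noise` step is not defined in this chapter. One common way to realize noise-injection training is to perturb the weights on every forward pass and apply the resulting gradient to the clean weights. The sketch below shows that idea on a toy logistic-regression model; the function body, the synthetic data, and the use of `noise_std` as an absolute standard deviation are all assumptions, not the SIDRA implementation.

```python
import numpy as np

def train_with_noise(x, y, noise_std=0.05, epochs=200, lr=0.5, seed=0):
    """Logistic regression trained with fresh Gaussian weight noise on every step."""
    rng = np.random.default_rng(seed)
    w = np.zeros(x.shape[1])
    for _ in range(epochs):
        w_noisy = w + rng.normal(0.0, noise_std, w.shape)  # perturbed forward pass
        p = 1.0 / (1.0 + np.exp(-(x @ w_noisy)))            # sigmoid predictions
        grad = x.T @ (p - y) / len(y)                       # gradient at the noisy weights
        w -= lr * grad                                      # update the clean weights
    return w

# Synthetic, linearly separable two-class data.
rng = np.random.default_rng(1)
x = rng.standard_normal((400, 8))
y = (x @ rng.standard_normal(8) > 0).astype(float)

w = train_with_noise(x, y)
acc = np.mean((x @ w > 0) == (y == 1))
print(f"clean-weight accuracy after noise-injection training: {acc:.2%}")
```

Because the optimizer only ever sees perturbed weights, it is pushed toward solutions whose accuracy changes little under perturbations of that size, which is the property the accurate simulator then measures.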

Multi-Y1 simulation:

For multi-chip models, the simulator runs each chip independently and models the PCIe delay between them.
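As a rough sketch of how per-chip compute time and PCIe delay combine, assume the model is split into stages that run one after another, with a fixed transfer delay per hop between chips. The sequential partitioning, the constant hop delay, and all numbers below are illustrative assumptions, not SIDRA specifications.

```python
def pipeline_latency_ms(chip_compute_ms: list[float], pcie_hop_ms: float) -> float:
    """Latency of one inference that passes through the chips in sequence."""
    hops = max(len(chip_compute_ms) - 1, 0)  # one PCIe transfer between neighbours
    return sum(chip_compute_ms) + hops * pcie_hop_ms

# Hypothetical 3-chip split with a 0.25 ms transfer per hop.
print(pipeline_latency_ms([2.0, 1.5, 1.5], pcie_hop_ms=0.25))  # 5.5
```

In this toy case the two hops add 0.5 ms to 5 ms of compute, so the interconnect accounts for about 9% of end-to-end latency.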

Performance model:

The simulator also estimates performance:

sim.estimate_latency(model)  # "5 ms / inference"
sim.estimate_energy(model)   # "20 mJ / inference"
sim.estimate_throughput()    # "3000 inferences / second"

These estimates become the pre-sale spec for customers.
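The three estimates are not independent: average power is energy per inference times throughput, and latency times throughput gives the number of inferences in flight (Little's law). The sketch below applies both relations to the example figures shown in the comments above, taken at face value; the variable names are illustrative.

```python
latency_s = 5e-3    # 5 ms per inference
energy_j = 20e-3    # 20 mJ per inference
throughput = 3000   # inferences per second

avg_power_w = energy_j * throughput  # energy/inference x inferences/second
in_flight = latency_s * throughput   # Little's law: concurrency = latency x rate

print(f"average power: {avg_power_w:.1f} W")
print(f"inferences in flight: {in_flight:.1f}")
```

With these example values the throughput figure implies about 15 inferences being processed concurrently (batched or pipelined) at roughly 60 W, a quick consistency check worth running before any estimate goes into a spec sheet.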

Open source:

The SIDRA fast simulator is MIT-licensed and open to everyone, so researchers can develop without SIDRA hardware.

GitHub: sidra/sidra-sim. PyPI: pip install sidra-sim.

Turkish academic use:

With the simulator open, Turkish students can write SIDRA theses and publish papers without hardware access. A catalyst for the Turkish academic ecosystem.

Experiment: MNIST on the Simulator

import sidra
import sidra.simulator as sim
import torch
from statistics import mean, stdev

# Load the trained model and compile it for Y1
model = torch.load("mnist_mlp.pth")
compiled = sidra.compile(model, target="y1")

# mnist_test: the MNIST test set, assumed to be loaded beforehand

# Fast simulator
fast_sim = sim.FastSimulator(compiled)
acc_fast = fast_sim.benchmark(mnist_test)
print(f"Fast sim: {acc_fast:.2%}")  # 97.8%

# Accurate simulator
acc_sim = sim.AccurateSimulator(compiled, 
    noise_model="y1_typical",
    ir_drop=True, temperature=300)
acc_acc = acc_sim.benchmark(mnist_test)
print(f"Accurate sim: {acc_acc:.2%}")  # 97.2%

# 100 Monte Carlo runs, one seed each
accs = [sim.AccurateSimulator(compiled, seed=i).benchmark(mnist_test)
         for i in range(100)]
print(f"MC: {mean(accs):.2%} ± {stdev(accs):.2%}")  # 97.1% ± 0.3%

This gives confidence in the model before deploying to a real Y1.

Quick Quiz

1/6 What does a digital twin do?

A game
Test software + model without hardware; pre-production validation
Encryption
Test

Lab Exercise

Model-hardware co-design via the simulator.

Scenario: deploy BERT-base to Y1.

Steps:

Train FP32: 88% GLUE.
Fast sim (no noise): 87% (quantization only).
Accurate sim: 85% (with 5% noise).
QAT + noise injection: retrain the model.
Accurate sim again: 87% (hit target).
FPGA test: 87% confirmed.
Y1 deploy: 87% (real).

Cycle: 2 weeks. Model ready before hardware.

Cheat Sheet

Digital twin: test software + model before hardware.
3 levels: fast, accurate, cycle-accurate.
Accurate sim: noise + IR drop + quantization + thermal.
Monte Carlo: random noise distribution.
FPGA prototyping: cycle-accurate but near-real-time.
QAT + sim loop: accuracy guarantee.
Open source: SIDRA fast sim MIT.

Vision: Simulator Future

Y1: Python fast sim + C++ accurate.
Y3: GPU-accelerated sim (100× faster).
Y10: ML-predicted sim (neural net predicts timing/energy).
Y100: SIDRA atop SIDRA simulator (chip itself is the simulator).
Y1000: quantum-hybrid simulator.

For Türkiye: an open-source simulator → an explosion in Turkish academic publication. Post-2027 SIDRA theses published internationally.
