End-to-End Production Stack Lab
Module 6's closing dive — a SIDRA app from scratch to deploy.
Prerequisites
What you'll learn here
- Tie together Module 6's nine preceding chapters with one end-to-end project
- Trace the PyTorch model → SIDRA inference flow step by step
- Validate how the 5 software layers interact
- Summarize production CI/CD and version management
- Prepare for Module 7 (manufacturing)
Hook: The Whole Module 6 Stack
Module 6's nine preceding chapters covered: driver → kernel module → firmware → ISPP → SDK → PyTorch → compiler → simulator → test. This chapter assembles them in a real-world scenario:
“Deploy a Turkish speech-recognition + translation app on SIDRA Y1.”
Then Module 7 (manufacturing) starts — software ready, time for real wafers.
Intuition: 9 Steps, 1 Product
Scenario: a Turkish startup ships a “Local AI Assistant” product. SIDRA Y1 + ARM CPU + microphone. Whisper-tiny (ASR) + MarianNMT (TR→EN) on SIDRA.
Development steps:
- Pick models (HuggingFace Whisper + Marian).
- Quantization-aware training (with Turkish data).
- Compiler (PyTorch → SIDRA binary).
- Simulator validation (accuracy 95%).
- FPGA prototype test.
- Driver + firmware test.
- Y1 prototype deploy.
- CI/CD pipeline.
- Production and distribution.
Formalism: End-to-End Pipeline
Step 1: Pick and download models.
```python
from transformers import WhisperForConditionalGeneration, WhisperProcessor

model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-tiny")
processor = WhisperProcessor.from_pretrained("openai/whisper-tiny")
```

39M parameters. Uses 9% of SIDRA Y1's capacity.
Step 2: Fine-tune on Turkish data (CPU/GPU).

```python
import torch
from datasets import load_dataset

tr_dataset = load_dataset("mozilla-foundation/common_voice_13_0", "tr")

# Standard PyTorch fine-tuning loop
optimizer = torch.optim.Adam(model.parameters(), lr=1e-5)
for epoch in range(3):
    for batch in tr_dataset:
        optimizer.zero_grad()
        loss = model(**batch).loss
        loss.backward()
        optimizer.step()
```

Result: Turkish WER 15% (acceptable).

Step 3: QAT (Quantization-Aware Training).
```python
from torch.quantization import prepare_qat, convert

model.train()
model_qat = prepare_qat(model)

# One epoch of quantization-aware training
for batch in tr_dataset:
    optimizer.zero_grad()
    model_qat(**batch).loss.backward()
    optimizer.step()

model_int8 = convert(model_qat.eval())
```

Result: an INT8 model with only 0.5% accuracy loss.
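The INT8 conversion can be pictured with a minimal affine-quantization sketch in plain Python (the scale here is illustrative, not the torch internals): each weight is mapped to an 8-bit code and back, and QAT trains the model to tolerate the resulting rounding error.

```python
def quantize(x: float, scale: float, zero_point: int = 0) -> int:
    """Map a float to an INT8 code: q = round(x / scale) + zero_point."""
    q = round(x / scale) + zero_point
    return max(-128, min(127, q))  # clamp to the INT8 range

def dequantize(q: int, scale: float, zero_point: int = 0) -> float:
    """Recover an approximate float from the INT8 code."""
    return (q - zero_point) * scale

# A weight of 0.4237 with scale 1/127 survives the round trip with an
# error below scale/2; out-of-range values saturate at the clamp.
scale = 1.0 / 127
w = 0.4237
w_hat = dequantize(quantize(w, scale), scale)
error = abs(w - w_hat)  # < scale / 2
```

The clamp is what makes outliers expensive: anything beyond ±127 scale units saturates, which is why calibration data (Step 4 below) matters for choosing scales.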
Step 4: SIDRA compile.
```python
import sidra

compiled = sidra.compile(
    model_int8,
    calib_data=tr_dataset[:100],
    target="y1",
    optimization_level=2,
)
print(compiled.size_mb, compiled.crossbar_count)
# 78 MB, 600 crossbars
```

Step 5: Simulator validation.
```python
sim = sidra.AccurateSimulator(compiled)
test_wer = sim.benchmark_wer(tr_test_set)
print(f"Sim WER: {test_wer:.1%}")  # 16.5%
```

Acceptance criterion: WER under 20%. Pass.
Step 6: FPGA test.
Run the same binary on the FPGA prototype. Accuracy confirmed.
```python
fpga = sidra.FPGADevice("xilinx_u280")
fpga.deploy(compiled)
fpga_wer = fpga.benchmark_wer(tr_test_set)
# 16.7% (matches the simulator)
```

Step 7: Y1 deploy.
```python
chip = sidra.Chip(0)  # real Y1 hardware
chip.deploy(compiled)

# Inference
audio = record_microphone(seconds=5)
text = chip.infer_whisper(audio)
print(text)  # "Merhaba, nasılsın?"
```

5 seconds of audio → 100 ms inference, 50 mJ of energy.
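The 5 s → 100 ms figure corresponds to a real-time factor of 0.02, i.e. inference runs about 50× faster than the audio it transcribes. A tiny helper (purely illustrative) makes the bookkeeping explicit:

```python
def realtime_factor(audio_seconds: float, inference_ms: float) -> float:
    """Inference time divided by audio duration; < 1.0 means faster than real time."""
    return (inference_ms / 1000.0) / audio_seconds

rtf = realtime_factor(audio_seconds=5.0, inference_ms=100.0)  # 0.02
speedup = 1.0 / rtf                                           # 50x real time
```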
Step 8: Add the translation model.
Marian NMT TR→EN follows a similar pipeline:
```python
nmt_compiled = sidra.compile(marian_model, ...)
chip.deploy([compiled, nmt_compiled])  # both models resident in parallel

text = chip.infer_whisper(audio)       # Turkish transcript
translation = chip.infer_marian(text)  # English translation
print(translation)  # "Hello, how are you?"
```

Total pipeline: 200 ms from audio to text to translation.
Step 9: CI/CD.
GitHub Actions workflow:
```yaml
name: SIDRA Build & Test
on: [push, pull_request]
jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Install SIDRA SDK
        run: pip install sidra
      - name: Test on simulator
        run: python tests/test_whisper.py
      - name: Test on FPGA (self-hosted runner)
        if: github.ref == 'refs/heads/main'
        run: python tests/test_fpga.py
```

Every commit is tested automatically; PR reviews show the CI status.
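A sketch of what `tests/test_whisper.py` might contain: the CI job passes only if simulated WER stays under the 20% acceptance criterion from Step 5. The `FakeSimulator` stub and the gate structure are assumptions for illustration; only the threshold comes from the text.

```python
WER_THRESHOLD = 0.20  # acceptance criterion from the simulator-validation step

class FakeSimulator:
    """Stand-in for the hardware simulator so the gate logic is runnable here;
    benchmark_wer would normally run the compiled Whisper model on a test set."""
    def benchmark_wer(self, test_set):
        return 0.165  # the WER measured in Step 5

def test_whisper_wer_gate():
    sim = FakeSimulator()
    wer = sim.benchmark_wer(test_set=None)
    assert wer < WER_THRESHOLD, f"WER {wer:.1%} exceeds the {WER_THRESHOLD:.0%} gate"

test_whisper_wer_gate()  # raises AssertionError on a quality regression
```

Encoding the acceptance criterion as an assertion is what turns "we checked once" into "every commit is checked".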
Version management:
```
my-app v1.0.0
├── sidra-sdk 1.2.3
│   ├── sidra-driver 1.1.0
│   ├── sidra-firmware 1.0.5
│   └── sidra-compiler 1.2.0
├── pytorch 2.4.0
└── transformers 4.50.0
```

SemVer plus a lock file: production pins stable versions, development tracks the bleeding edge.
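The lock-file discipline boils down to a compatibility rule. A minimal sketch of caret-style SemVer matching (same major version, installed version equal or newer; this ignores the 0.x special case, and real tools like pip and npm implement richer specifiers):

```python
def parse(version: str) -> tuple:
    """'1.2.3' -> (1, 2, 3)"""
    return tuple(int(part) for part in version.split("."))

def compatible(required: str, installed: str) -> bool:
    """Caret-style rule: same major version, and installed >= required."""
    req, inst = parse(required), parse(installed)
    return inst[0] == req[0] and inst >= req

# sidra-sdk pinned at 1.2.3 accepts 1.3.0 but rejects 2.0.0
assert compatible("1.2.3", "1.3.0")
assert not compatible("1.2.3", "2.0.0")
```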
Production deploy:
Stages on the customer device:
- SIDRA driver (kernel module, system install).
- Firmware (Y1 boot ROM, once).
- SDK runtime (ships with app).
- Compiled model (inside the app).
OTA (Over-The-Air) updates: models and firmware can be updated on devices already in the field.
Production scale:
The Local AI Assistant startup ships 100K devices/year. Each uses 1 SIDRA Y1 → 100K chips/year.
SIDRA workshop capacity: 100K/year (chapter 5.14). A match: domestic production is feasible.
Y10 + scale:
Same product on Y10: GPT-4-class model. Phone-sized device. Volume 100K → 1M/year. Mini-fab needed (chapter 7.5).
Experiment: Full-Pipeline Performance
Local AI Assistant Y1:
| Stage | Time |
|---|---|
| Microphone recording | 5000 ms (user speaks) |
| Audio preprocessing (CPU) | 50 ms |
| Whisper inference (SIDRA) | 100 ms |
| Marian translation (SIDRA) | 50 ms |
| TTS (CPU, post-process) | 200 ms |
| Total | 5400 ms |
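The table's arithmetic as a sanity-checkable sketch (stage names and times copied from above):

```python
stage_ms = {
    "microphone_recording": 5000,   # user speaks
    "audio_preprocessing_cpu": 50,
    "whisper_inference_sidra": 100,
    "marian_translation_sidra": 50,
    "tts_cpu": 200,
}

total_ms = sum(stage_ms.values())                    # 5400 ms end to end
sidra_ms = (stage_ms["whisper_inference_sidra"]
            + stage_ms["marian_translation_sidra"])  # 150 ms on SIDRA
```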
Pure inference: 150 ms (SIDRA share). Low latency.
Energy:
- Inference (SIDRA): 50 + 25 = 75 mJ.
- CPU + sensor: ~500 mJ (5-second active use).
- Total: ~575 mJ/interaction.
Battery: 4000 mAh at 3.7 V ≈ 14.8 Wh ≈ 53 kJ. Interactions per charge: 53,000 J / 0.575 J ≈ 92,000. At 100/day → 920 days ≈ 2.5 years of battery.
(In reality: idle power dominates. Active inference energy negligible for battery life.)
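The back-of-the-envelope battery math, assuming a 3.7 V cell (so 4000 mAh ≈ 14.8 Wh ≈ 53 kJ):

```python
def interactions_per_charge(capacity_mah: float, cell_volts: float,
                            mj_per_interaction: float) -> float:
    """Battery energy in joules divided by the energy of one interaction."""
    battery_j = capacity_mah / 1000.0 * cell_volts * 3600.0  # mAh -> Wh -> J
    return battery_j / (mj_per_interaction / 1000.0)         # mJ -> J

n = interactions_per_charge(4000, 3.7, 575)  # roughly 92,000-93,000
days = n / 100                               # at 100 interactions/day
```

As the parenthetical above notes, this is an upper bound: idle power, not inference energy, dominates real battery life.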
Module 6 Closing Quiz
Integrated Lab: Local AI Assistant Design
You’re a Turkish startup. Design a “Local AI Assistant” on SIDRA Y1.
Targets:
- Turkish speech + translation.
- 24-hour battery.
- 200 TL retail.
- 100K/year volume.
Decisions:
(a) Which models? Sizes?
(b) How much spare Y1 capacity is enough?
(c) How to optimize battery life?
(d) Version management: how to OTA update?
(e) Production CapEx and OpEx?
Solutions
(a) Whisper-tiny (39M) + MarianNMT TR-EN (75M) + TTS (50M) = 164M parameters = 39% of SIDRA Y1. Headroom for more models or batches.
(b) Y1 419M, model 164M → 39%. Efficient use.
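The capacity check in (a)/(b) as a one-liner (parameter counts in millions; the 419M weight capacity for Y1 is taken from the text):

```python
def utilization(model_params_m, capacity_m=419):
    """Fraction of on-chip weight capacity used by a set of models."""
    return sum(model_params_m) / capacity_m

u = utilization([39, 75, 50])  # Whisper-tiny + MarianNMT + TTS = 164M
# u ≈ 0.39, leaving ~61% headroom for extra models or batching
```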
(c) DVFS Idle (chapter 5.11) → 100 mW idle. Active inference 3 W but ~5% time. Average: 0.3 W. Battery 14.8 Wh / 0.3 W = 50 hours = 2 days. 24-hour target comfortable.
(d) OTA: the SIDRA SDK supports both firmware and model updates: a ~100 MB download over 4G followed by an ISPP reflash (640 ms on Y1). Invisible to the end user.
(e) CapEx: SIDRA Y1 workshop investment $5M (shared across products), plus $500K of product-specific R&D. OpEx per unit: Y1 chip $50, ARM SoC $20, microphone $30, with the remaining components bringing the BOM to about $243/unit. At 100K units/year and a 200 TL retail price the margin is thin; the Y3 generation brings the unit cost down.
Conclusion: the Local AI Assistant is productizable in 2027. A concrete output of the Turkish AI ecosystem.
Module 6 Cheat Sheet
10 chapters in summary:
- 6.1 OS + PCIe driver basics.
- 6.2 aether-driver internals.
- 6.3 RISC-V firmware.
- 6.4 ISPP algorithm.
- 6.5 SDK layers.
- 6.6 PyTorch backend.
- 6.7 Compiler.
- 6.8 Digital twin.
- 6.9 Test/calibration.
- 6.10 Production-stack lab (this).
Module 6 message: the SIDRA chip’s usefulness depends on the software stack. Five layers co-designed, developed, and tested.
Vision: Hardware + Software Together
Module 6 is software. Module 7 is manufacturing: how is the chip actually made?
- Cleanroom: UNAM, workshop, mini-fab.
- Wafer flow: TSMC 28 nm CMOS + UNAM BEOL memristor.
- Packaging: ASE, Amkor (Taiwan).
- Test: SIDRA workshop.
Hardware + software co-design: SIDRA’s holistic approach. Module 7 completes the last side.
For Türkiye: software is an existing strength, while the hardware infrastructure is a matter of investment. Together they position Türkiye as an AI leader.
Further Reading
- Next module: 🚧 7.1 · Cleanrooms and ISO Classes — Coming soon
- Previous: 6.9 — Test, Calibration, Verification
- Module 5 review: 5.15 — Thermal and Packaging Deep Dive.
- Module 4 review: 4.8 — Linear Algebra Lab.
- Production software: Continuous deployment, GitHub Actions documentation.