SDK Layers
SIDRA Software Development Kit — the developer's interface.
Prerequisites
What you'll learn here
- Name the SDK's three layers (low-level C, high-level C++, Python)
- Read the API design principles and examples
- Summarize PyTorch and TensorFlow plugin structure
- Describe versioning, ABI stability, and documentation strategies
- Identify the developer ecosystem (sample apps, demos)
Hook: Driver Direct Use Is Painful
Raw driver IOCTLs are low-level, error-prone C. The SDK wraps them in three layers:
- Low-level C API: one function per IOCTL.
- High-level C++ API: Model, Tensor, Inference classes.
- Python API: `import sidra` and you're done.
Developers use Python ~95% of the time. The SDK handles the rest underneath.
Intuition: 3 Layers of Abstraction
[App: PyTorch model + a few calls]
↓
[Python sidra package]
↓
[C++ libsidra (SDK)]
↓
[C aether-driver (kernel)]
↓
[Hardware]

Each layer abstracts the one below it. The developer lives at the top; the complexity underneath stays invisible.
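The mechanics of this layering can be sketched with ctypes against a real C library — here libm stands in for libsidra_c.so purely as an illustration of how the Python layer wraps a raw C ABI; the actual sidra package is not shown:

```python
import ctypes
import ctypes.util

# Load a shared C library and declare one function's signature,
# exactly as a Python package would wrap a stable C ABI.
libm = ctypes.CDLL(ctypes.util.find_library("m") or "libm.so.6")
libm.cos.restype = ctypes.c_double
libm.cos.argtypes = [ctypes.c_double]

class MathChip:
    """Thin Python facade over a raw C ABI (stand-in for sidra.Chip)."""
    def cos(self, x: float) -> float:
        return libm.cos(x)

print(MathChip().cos(0.0))  # 1.0
```

The same pattern scales up: the pure-Python `sidra` layer only has to know the fixed C ABI, never the kernel interface underneath.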
Formalism: SDK API Design
Python API (most common):

```python
import sidra
import numpy as np

# Open the chip
chip = sidra.Chip(device_id=0)
print(f"SIDRA {chip.version}, {chip.memristor_count}M memristors")

# Load a model (from a PyTorch state_dict)
import torch
model_pt = torch.load("my_model.pt")
model = sidra.Model.from_pytorch(model_pt)
chip.load_model(model)

# Inference
input_tensor = np.random.randn(1, 784).astype(np.float32)
output = chip.infer(input_tensor)
print(output.shape, output)  # (1, 10) MNIST class scores
```

C++ API (performance-critical):
```cpp
#include <sidra/sidra.h>

int main() {
    sidra::Chip chip(0);
    sidra::Model model = sidra::Model::load("model.bin");
    chip.load_model(model);
    std::vector<float> input(784);
    auto output = chip.infer(input);
    return 0;
}
```

Low-level C (driver direct):
```c
#include <fcntl.h>
#include <sys/ioctl.h>

int fd = open("/dev/sidra0", O_RDWR);
struct inference_req req = {.input = ..., .output = ...};
ioctl(fd, SIDRA_IOCTL_INFER, &req);
```

Tensor class:
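From Python, such a request buffer could in principle be packed by hand. A sketch, assuming a hypothetical two-pointer layout for `struct inference_req` (the real driver's layout may well differ); `fcntl.ioctl` would then submit the buffer on an open `/dev/sidra0` descriptor:

```python
import struct

# Hypothetical wire layout: two little-endian 64-bit addresses
# (input buffer, output buffer). Purely illustrative.
def pack_inference_req(input_addr: int, output_addr: int) -> bytes:
    return struct.pack("<QQ", input_addr, output_addr)

req = pack_inference_req(0x1000, 0x2000)
print(len(req))  # 16 bytes: two 8-byte pointers
```

This is exactly the tedium the low-level C API ("one function per IOCTL") and the layers above it exist to hide.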
```cpp
class Tensor {
    DType dtype;      // FP32, INT8, INT4
    Shape shape;      // dimensions
    DeviceMem data;   // lives in SIDRA memory
    Tensor reshape(Shape new_shape);
    Tensor quantize(DType target);
    Tensor copy_to_host();
};
```

Similar to a CUDA tensor on GPUs: data lives in device (GPU/SIDRA) memory, and copies back to the host are explicit.
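What `Tensor::quantize(INT8)` might do internally can be sketched with NumPy. Symmetric per-tensor quantization is one plausible scheme, not necessarily the SDK's actual algorithm:

```python
import numpy as np

def quantize_int8(x: np.ndarray):
    """Symmetric per-tensor INT8 quantization (illustrative scheme)."""
    # One shared scale maps the largest magnitude onto +/-127.
    scale = float(np.abs(x).max()) / 127.0 if x.size else 1.0
    q = np.clip(np.round(x / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    return q.astype(np.float32) * scale

x = np.array([0.5, -1.0, 0.25], dtype=np.float32)
q, s = quantize_int8(x)
x_hat = dequantize(q, s)
print(q, float(np.abs(x - x_hat).max()))
```

Note the round trip is lossy: the worst-case error per element is about half a quantization step, which is why the SDK quantizes once at load time rather than on every inference.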
Model class:
```cpp
class Model {
    std::vector<Layer> layers;
    Quantization q;
    static Model from_pytorch(const PyTorchModel&);
    static Model from_onnx(const std::string& path);
    void quantize(QuantConfig cfg);   // FP32 -> INT8
    void compile_for(Chip& chip);     // crossbar mapping
};
```

ONNX (Open Neural Network Exchange) is the standard, framework-agnostic model format.
Inference workflow:

```python
chip = sidra.Chip(0)

# One-time setup
model = sidra.Model.from_onnx("yolov8.onnx")
model.quantize(sidra.INT8)
model.compile_for(chip)
chip.load_model(model)

# Repeated inference
for image in stream:
    output = chip.infer(image)
    process(output)
```

ABI stability:
The C++ ABI can change between compiler and SDK versions, which would force an application rebuild at every SDK upgrade. The solution:
- The C ABI stays fixed (libsidra_c.so).
- The C++ wrapper is inline (header-only).
- The Python package is a pure-Python wrapper.
Strategy: a minimal, stable C ABI plus a rich Python layer, with C++ in between.
Versioning:
The library follows semantic versioning (SemVer), e.g. libsidra.so.1.0.0:
- Major (1.x): breaking changes.
- Minor (1.1): new features, backward-compatible.
- Patch (1.0.1): bug fixes.
Apps link against libsidra.so.1, so any 1.x update is picked up automatically.
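The compatibility rule that the soname scheme encodes can be sketched in a few lines of Python (a toy check, not part of the SDK):

```python
def parse_semver(v: str):
    """Split 'major.minor.patch' into three ints."""
    major, minor, patch = (int(p) for p in v.split("."))
    return major, minor, patch

def is_compatible(linked: str, installed: str) -> bool:
    """An app built against `linked` can run with `installed` when the
    major version matches and the installed minor is the same or newer."""
    lmaj, lmin, _ = parse_semver(linked)
    imaj, imin, _ = parse_semver(installed)
    return imaj == lmaj and imin >= lmin

print(is_compatible("1.0.0", "1.2.3"))  # True: 1.x update, accepted
print(is_compatible("1.2.0", "2.0.0"))  # False: major bump breaks ABI
```

Patch numbers never affect compatibility, which is why 1.0.1 can ship bug fixes without any app involvement.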
Documentation:
- API reference (Doxygen + Sphinx).
- Tutorial notebooks (Jupyter).
- Sample apps on GitHub.
- Stack Overflow tag.
PyTorch / TensorFlow plugin:

```python
# PyTorch
import torch
import sidra

# Register the SIDRA backend
torch.backends.sidra.enabled = True

# A normal PyTorch model now runs on SIDRA
model = torch.load("model.pt").to("sidra:0")
output = model(input)  # transparent SIDRA inference
```

This mirrors NVIDIA's CUDA backend; PyTorch 2.0+ provides a protocol for registering custom backends ("PrivateUse1").
TensorFlow XLA:
An XLA (Accelerated Linear Algebra) backend exposes the chip via tf.config.set_visible_devices(["sidra:0"]). Its scope is limited and it is still in development.
SDK size:
- libsidra.so: ~5 MB.
- Python sidra package: ~10 MB (compiled extension included).
- Headers: ~500 KB.
Minimal impact on app size.
Sample apps:
sidra-examples/ repo:
- mnist_classifier.py (101 lines)
- imagenet_resnet50.py
- bert_qa.py
- speech_whisper.py
- gpt2_chat.py
These are the developer's starting point.
Experiment: 10-Line MNIST Inference
```python
import sidra
import torch
import torchvision
import torchvision.transforms as T

# Load the model (standard PyTorch; a small MLP for MNIST)
model = torch.nn.Sequential(
    torch.nn.Linear(784, 128), torch.nn.ReLU(), torch.nn.Linear(128, 10))
model.load_state_dict(torch.load("mnist_mlp.pth"))

# Move it to SIDRA
chip = sidra.Chip(0)
sidra_model = sidra.Model.from_pytorch(model).quantize(sidra.INT8)
chip.load_model(sidra_model)

# Inference
mnist_test = torchvision.datasets.MNIST("/data", train=False, transform=T.ToTensor())
img, label = mnist_test[0]
prediction = chip.infer(img.numpy().reshape(1, 784))
print(f"Predicted: {prediction.argmax()}, True: {label}")  # 7, 7
```

Ten lines of application code. Underneath, the SDK:
- Parse PyTorch state_dict.
- Build model graph.
- INT8 quantize.
- Crossbar mapping.
- Program with ISPP.
- Driver IOCTL inference.
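The crossbar-mapping step in that pipeline can be illustrated with a toy tiler. The 256×256 crossbar dimension is an assumption made purely for illustration:

```python
import numpy as np

TILE = 256  # assumed crossbar array dimension; the real size may differ

def tile_weights(w: np.ndarray, tile: int = TILE):
    """Split a weight matrix into crossbar-sized tiles, zero-padding edges."""
    rows = -(-w.shape[0] // tile)  # ceiling division
    cols = -(-w.shape[1] // tile)
    padded = np.zeros((rows * tile, cols * tile), dtype=w.dtype)
    padded[: w.shape[0], : w.shape[1]] = w
    return [
        padded[r * tile:(r + 1) * tile, c * tile:(c + 1) * tile]
        for r in range(rows)
        for c in range(cols)
    ]

w = np.random.randn(784, 128).astype(np.float32)  # first MNIST MLP layer
tiles = tile_weights(w)
print(len(tiles), tiles[0].shape)  # 4 tiles of (256, 256)
```

A 784×128 layer needs ceil(784/256) × ceil(128/256) = 4 tiles; each tile's conductances are then programmed with ISPP and the partial products are summed after readout.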
Timing:
- Coding + setup: 5 minutes.
- Model load (including 640 ms of ISPP programming): ~1 second.
- Inference: 25 µs per MNIST image.
Quick Quiz
Lab Exercise
Draft a plan for SIDRA SDK developer-training materials.
Audiences:
- Academic researcher (Python, PyTorch).
- Production engineer (C++, performance).
- Embedded developer (C, kernel).
Resources (proposed):
- Quick start (15 min): Python + MNIST.
- Tutorial notebook series (10 steps).
- C++ API guide.
- Performance optimization guide.
- Sample apps GitHub.
Community:
- Discord/Slack Q&A.
- Monthly webinar.
- Annual conference in Türkiye.
Cheat Sheet
- SDK 3 layers: C low-level, C++ high-level, Python.
- Python easiest: 10-line MNIST.
- ONNX import: framework-agnostic.
- PyTorch backend: model.to("sidra:0").
- ABI stability: C fixed, C++ wrapped.
- SDK size: ~10 MB.
- Sample apps: GitHub repo.
Vision: SIDRA Developer Ecosystem
- Y1: Python + PyTorch support.
- Y3: TensorFlow + JAX backend.
- Y10: Mobile SDK (Android/iOS).
- Y100: SIDRA-native AI framework.
- Y1000: Quantum-AI hybrid SDK.
For Türkiye: SDK with Turkish documentation, courses at Turkish universities, broad academic adoption.
Further Reading
- Next chapter: 6.6 — Writing a PyTorch Backend
- Previous: 6.4 — ISPP Algorithm
- PyTorch backend: pytorch.org “PrivateUse1 backend”.
- ONNX: onnx.ai documentation.
- API design: Bloch, Effective Java (general API design).