worb architecture

A single executable, a single SQLite file, zero external dependencies.

Layers

Ingestion layer
- the file_stream endpoint accepts metrics from the wandb Python client
- each request carries JSON with scalars, histograms, events, and console logs for a run
- data is parsed and passed to the write-ahead log without touching SQLite on the hot path
Write-Ahead Log (WAL)
- an application-level WAL that buffers incoming rows in an append-only file on disk
- a background goroutine drains the WAL in batches and writes to SQLite
- decouples ingestion throughput from database write speed
- allows the HTTP handler to return immediately, keeping the wandb client happy
Store
- wraps a single SQLite database (or DuckDB, if configured)
- tables: projects, runs, run_steps, history, events, console_logs, files, artifacts
- history rows store each metric key/value as a separate row for efficient per-key queries
- inserts are batched inside transactions for throughput; read queries stream results to keep memory flat
API layer
- a chi router serves three API surfaces: wandb-compatible GraphQL, a REST API for the dashboard, and the file_stream protocol
- GraphQL resolves viewer, project, and run queries so wandb.init() and wandb.finish() work as expected
- REST endpoints stream history as newline-delimited JSON for chart rendering
Dashboard
- an embedded SPA (the ui/ package) compiled into the binary via go:embed
- renders metric charts, histograms, run tables, console logs, and a SQL console
- no Node.js build step, no npm, just static assets baked into the Go binary
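
As an illustration of the newline-delimited JSON streaming mentioned under the API layer, here is a minimal Go sketch. The HistoryRow fields, the function names, and the application/x-ndjson content type are assumptions made for the example, not worb's actual code.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// HistoryRow is a hypothetical response shape; worb's real fields may differ.
type HistoryRow struct {
	Step  int64   `json:"step"`
	Key   string  `json:"key"`
	Value float64 `json:"value"`
}

// streamHistory writes one JSON object per line and flushes after each row,
// so neither server nor client has to hold the full series in memory.
func streamHistory(w http.ResponseWriter, rows <-chan HistoryRow) error {
	w.Header().Set("Content-Type", "application/x-ndjson")
	enc := json.NewEncoder(w) // Encode terminates every value with "\n"
	flusher, canFlush := w.(http.Flusher)
	for row := range rows {
		if err := enc.Encode(row); err != nil {
			return err
		}
		if canFlush {
			flusher.Flush()
		}
	}
	return nil
}

// render runs the handler logic against an in-memory recorder.
func render(rows []HistoryRow) string {
	ch := make(chan HistoryRow, len(rows))
	for _, r := range rows {
		ch <- r
	}
	close(ch)
	rec := httptest.NewRecorder()
	if err := streamHistory(rec, ch); err != nil {
		panic(err)
	}
	return rec.Body.String()
}

func main() {
	fmt.Print(render([]HistoryRow{
		{Step: 0, Key: "loss", Value: 0.9},
		{Step: 1, Key: "loss", Value: 0.7},
	}))
}
```

A chart can start drawing as soon as the first line arrives. A real handler would probably flush less often than once per row.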

Everything runs in one process. There is no message queue, no cache layer, no container orchestration. Apart from uploaded files (stored under ~/.worb/files/), the WAL file and the SQLite database are the only state on disk.

The WAL

The wandb client streams metrics over HTTP at high frequency. Writing each row directly to SQLite would serialize every request behind a database lock, so worb interposes an application-level write-ahead log.

Incoming data is appended to a single WAL file on disk (~/.worb/wal.jsonl). A background goroutine wakes up every 500 ms (or when a batch reaches 50,000 items), reads a chunk from the WAL, and flushes it to SQLite in a single transaction. The WAL file is compacted once it exceeds 1 GB.

This design means the HTTP handler only does an append() to a file and returns. SQLite writes happen in the background, batched and ordered, with no contention on the ingestion path.
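
The append-then-drain pattern can be sketched in Go roughly as follows. This is an illustration under assumptions, not worb's implementation: the WAL type, its methods, and the flush callback (standing in for a single SQLite transaction) are hypothetical. The sketch models only the timer wake-up, caps each drain at maxBatch records, and leaves out the size-triggered wake-up and compaction.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"os"
	"sync"
	"time"
)

// WAL is an append-only JSON-lines file plus the drainer's read offset.
// Every name in this sketch is hypothetical, not worb's actual code.
type WAL struct {
	mu       sync.Mutex
	f        *os.File // write handle; request handlers only ever append
	offset   int64    // bytes already flushed; touched only by the drainer
	maxBatch int      // upper bound on records per flush
}

// Append is all a request handler does: one file write, then return.
func (w *WAL) Append(record []byte) error {
	w.mu.Lock()
	defer w.mu.Unlock()
	_, err := w.f.Write(append(append([]byte{}, record...), '\n'))
	return err
}

// Drain reads up to maxBatch unflushed lines and hands them to flush, which
// stands in for one SQLite transaction. It returns the batch size.
func (w *WAL) Drain(flush func([][]byte) error) (int, error) {
	w.mu.Lock() // appends wait, so no line is read half-written
	f, err := os.Open(w.f.Name())
	if err != nil {
		w.mu.Unlock()
		return 0, err
	}
	// Read from the first unflushed byte to the end of the file.
	rd, next := bufio.NewReader(io.NewSectionReader(f, w.offset, 1<<62)), w.offset
	var batch [][]byte
	for len(batch) < w.maxBatch {
		line, err := rd.ReadBytes('\n')
		if err != nil {
			break // io.EOF or read error: stop at the last complete line
		}
		batch = append(batch, line[:len(line)-1])
		next += int64(len(line))
	}
	f.Close()
	w.mu.Unlock()
	if len(batch) == 0 {
		return 0, nil
	}
	if err := flush(batch); err != nil {
		return 0, err // offset unchanged, so the batch is retried
	}
	w.offset = next
	return len(batch), nil
}

// Run is the background goroutine: drain on every tick until stop closes.
func (w *WAL) Run(stop <-chan struct{}, every time.Duration, flush func([][]byte) error) {
	t := time.NewTicker(every)
	defer t.Stop()
	for {
		select {
		case <-stop:
			return
		case <-t.C:
			w.Drain(flush) // a failed flush is retried on the next tick
		}
	}
}

// demo appends three records, then drains twice with maxBatch = 2.
func demo() (first, second int) {
	f, err := os.CreateTemp("", "wal-*.jsonl") // the text uses ~/.worb/wal.jsonl
	if err != nil {
		panic(err)
	}
	defer os.Remove(f.Name())
	defer f.Close()
	w := &WAL{f: f, maxBatch: 2}
	for step := 0; step < 3; step++ {
		w.Append([]byte(fmt.Sprintf(`{"run_id":"r1","step":%d,"key":"loss","value":0.5}`, step)))
	}
	store := func([][]byte) error { return nil } // stand-in for a SQLite transaction
	first, _ = w.Drain(store)
	second, _ = w.Drain(store)
	return first, second
}

func main() {
	fmt.Println(demo())
}
```

In a server, go w.Run(stop, 500*time.Millisecond, store) would be started once at boot. Because the offset only advances after a successful flush, a failed transaction leaves the batch in the file for the next attempt.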

Storage

All experiment data lives in a single SQLite database file at ~/.worb/worb.db. A backup is a file copy: no export tool, dump command, or migration is required. You can also point worb at a Turso database for hosted or replicated SQLite, or at DuckDB if you prefer columnar storage.

The schema is designed around the wandb data model: projects contain runs, runs contain history rows. History is stored as one row per metric per step (run_id, step, key, value) rather than one JSON blob per step. This makes per-key range queries fast and avoids parsing large JSON objects on read.
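
To make the one-row-per-metric layout concrete, the following Go sketch flattens a single logged step into (run_id, step, key, value) rows. The HistoryRow type and flattenStep function are hypothetical, and skipping non-numeric values is a simplification of this example rather than a statement about how worb treats them.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sort"
)

// HistoryRow mirrors the (run_id, step, key, value) layout described above.
type HistoryRow struct {
	RunID string
	Step  int64
	Key   string
	Value float64
}

// flattenStep turns one logged step, given as a JSON object of metric names
// to numbers, into one row per metric. Non-numeric values are skipped here.
func flattenStep(runID string, step int64, payload []byte) ([]HistoryRow, error) {
	var metrics map[string]any
	if err := json.Unmarshal(payload, &metrics); err != nil {
		return nil, err
	}
	keys := make([]string, 0, len(metrics))
	for k := range metrics {
		keys = append(keys, k)
	}
	sort.Strings(keys) // deterministic row order
	rows := make([]HistoryRow, 0, len(keys))
	for _, k := range keys {
		if v, ok := metrics[k].(float64); ok {
			rows = append(rows, HistoryRow{RunID: runID, Step: step, Key: k, Value: v})
		}
	}
	return rows, nil
}

func main() {
	rows, err := flattenStep("run-1", 7, []byte(`{"loss": 0.25, "acc": 0.9, "note": "warmup"}`))
	if err != nil {
		panic(err)
	}
	for _, r := range rows {
		fmt.Printf("%s %d %s %g\n", r.RunID, r.Step, r.Key, r.Value)
	}
}
```

With rows in this shape, fetching one metric over a step range is a filter on run_id, key, and step, with no JSON decoding on the read path.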

Uploaded files (model checkpoints, artifacts) go to a flat directory on disk (~/.worb/files/) with metadata tracked in the files table. The file store is a simple content-addressed layout, no object storage dependency.
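
A content-addressed layout of this kind could look like the Go sketch below. The choice of SHA-256, the use of the hex digest as the file name, and the putFile helper are all assumptions for illustration; the source does not say which hash or naming scheme worb uses, and the metadata row in the files table is left out.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"os"
	"path/filepath"
)

// putFile stores data under root, named by its SHA-256 digest, and returns
// the digest. Identical content always maps to the same path.
// Hash function and naming scheme are assumptions of this sketch.
func putFile(root string, data []byte) (string, error) {
	sum := sha256.Sum256(data)
	digest := hex.EncodeToString(sum[:])
	if err := os.MkdirAll(root, 0o755); err != nil {
		return "", err
	}
	path := filepath.Join(root, digest)
	if _, err := os.Stat(path); err == nil {
		return digest, nil // already stored; nothing to write
	}
	// Write to a temporary file first, then rename into place, so readers
	// never observe a partially written upload.
	tmp, err := os.CreateTemp(root, "upload-*")
	if err != nil {
		return "", err
	}
	if _, err := tmp.Write(data); err != nil {
		tmp.Close()
		os.Remove(tmp.Name())
		return "", err
	}
	if err := tmp.Close(); err != nil {
		os.Remove(tmp.Name())
		return "", err
	}
	if err := os.Rename(tmp.Name(), path); err != nil {
		os.Remove(tmp.Name())
		return "", err
	}
	return digest, nil
}

// demo stores the same bytes twice and reports whether the digests match
// and how many files ended up on disk.
func demo() (sameDigest bool, files int) {
	root, err := os.MkdirTemp("", "worb-files") // the text uses ~/.worb/files/
	if err != nil {
		panic(err)
	}
	defer os.RemoveAll(root)
	a, err := putFile(root, []byte("checkpoint bytes"))
	if err != nil {
		panic(err)
	}
	b, err := putFile(root, []byte("checkpoint bytes"))
	if err != nil {
		panic(err)
	}
	entries, err := os.ReadDir(root)
	if err != nil {
		panic(err)
	}
	return a == b, len(entries)
}

func main() {
	fmt.Println(demo())
}
```

Uploading the same checkpoint twice costs one file on disk, and the digest doubles as an integrity check when the file is read back.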

Compatibility

worb implements enough of the wandb server API that the standard wandb Python client works without patches. You set WANDB_BASE_URL to your worb instance, and wandb.init(), wandb.log(), and wandb.finish() work as expected.

The GraphQL endpoint handles the handshake and metadata queries the client makes on init and finish. The file_stream endpoint accepts the chunked metric uploads. A built-in GraphQL playground is available at /playground for exploration.

The SQL console in the dashboard lets you query the underlying SQLite database directly, which is useful for ad-hoc analysis that goes beyond what the charts offer.
