Pure iii-sdk worker for json-render UI generation — JSONL patch streaming, caching, rate limiting, and validation. No standalone HTTP server; all endpoints are iii functions with HTTP triggers served by the iii engine.
Browser ──POST──> iii-engine (:3111) ──> spec-forge worker (Rust) ──> Claude API
│ │
│ JSONL patches (RFC 6902)
│ │
Browser <──WebSocket── iii Channel (:49134) <─────┘
│
└──> Progressive render (each patch = 1 component)
json-render calls the LLM on every request. No caching, no streaming, no rate limiting. API key exposed to client.
spec-forge is a pure iii-sdk worker that streams JSONL patches (RFC 6902) through iii Channels — each patch arrives at the browser via WebSocket the moment Claude generates it, so the UI fills in progressively.
| json-render | spec-forge + iii | |
|---|---|---|
| Architecture | Client-side LLM calls | iii worker with HTTP triggers |
| Output format | Full JSON response | JSONL patches (RFC 6902) |
| Streaming | Vercel AI SDK streamText |
iii Channels (WebSocket) |
| First paint | After full LLM response | After first patch (~200ms) |
| Cache | None | SHA-256 exact + TF-IDF semantic |
| Repeat request | 3-5s LLM call | 0ms cached |
| Rate limiting | None | Token bucket + concurrency semaphore |
| API key | Client-side | Server-side only |
| Observability | None | OpenTelemetry (built-in via iii) |
All numbers measured on Apple M-series. Reproduce: ./bench/run.sh
Both frameworks running the same operations in Node.js — no language advantage, pure algorithm comparison:
| Operation | json-render | spec-forge | Winner | Speedup |
|---|---|---|---|---|
| JSONL 3 patches (bulk) | 4.83 µs | 3.13 µs | spec-forge | 1.5x |
| JSONL 9 patches (chunked) | 16.77 µs | 10.11 µs | spec-forge | 1.7x |
| Prompt build (minimal) | 3.02 µs | 1.42 µs | spec-forge | 2.1x |
| Prompt build (dashboard) | 6.10 µs | 3.48 µs | spec-forge | 1.8x |
| Pipeline 9 elements | 15.90 µs | 11.94 µs | spec-forge | 1.3x |
| Validate 50 elements | 9.77 µs | 9.82 µs | tie | 1.0x |
| Parse 500 elements | 328 µs | 326 µs | tie | 1.0x |
spec-forge's JSONL parser is leaner — no dedup Set, no object spread on every batch. json-render's createSpecStreamCompiler buffers text, deduplicates lines, and copies the result object on every patch.
The Rust worker is where the real gap opens. These are actual production code paths with std::hint::black_box:
| Operation | Rust | JavaScript | Speedup |
|---|---|---|---|
| Stringify 3 elements | 0.56 µs | 1.12 µs | 2.0x |
| Stringify 500 elements | 63.6 µs | 141.5 µs | 2.2x |
| Stringify 2000 elements | 253 µs | 618 µs | 2.4x |
| Validate 500 elements | 74.0 µs | 112.2 µs | 1.5x |
| Validate 2000 elements | 312 µs | 586 µs | 1.9x |
| Parse 2000 elements | 1,062 µs | 1,497 µs | 1.4x |
json-render has zero caching. Every request is a fresh LLM call (3-5 seconds).
| Operation | spec-forge | json-render |
|---|---|---|
| SHA-256 exact cache hit | 0.04 µs | N/A (no cache) |
| TF-IDF semantic hit (10 entries) | 13.7 µs | N/A |
| TF-IDF semantic hit (100 entries) | 905 µs | N/A |
| Rate limiter acquire + release | 0.07 µs | N/A |
On repeat requests, spec-forge returns in < 1 µs (SHA-256 cache hit). json-render has no cache and re-invokes Claude every time: 3,000,000 - 5,000,000 µs (3-5 seconds).
In practice this means repeat-request latency drops from 3-5 seconds to sub-microsecond — the user gets an instant response instead of waiting for a full LLM round-trip.
The TF-IDF semantic cache catches similar prompts too — "A revenue dashboard" matches "A sales dashboard with revenue metrics" at the 0.85 cosine threshold without burning an LLM call.
When spec-forge does call the LLM (cache miss), the total overhead from caching + validation + storage:
| Stage | Time |
|---|---|
| SHA-256 cache key | 2.65 µs |
| Exact cache lookup (miss) | 0.01 µs |
| TF-IDF semantic check (100 entries) | 905 µs |
| Rate limiter acquire | 0.07 µs |
| JSONL patch parsing (9 patches) | 8.27 µs |
| Spec validation (9 elements) | 0.90 µs |
| Cache store | ~1 µs |
| Total overhead | < 1 ms |
Negligible against the 2-5 second LLM call. You get caching, rate limiting, validation, and observability essentially for free.
./bench/run.sh # Everything (Rust + JS comparison + e2e)
./bench/run.sh rust # Rust native only (cargo build --release)
./bench/run.sh compare # Side-by-side JS comparison table
./bench/run.sh e2e # End-to-end through iii engine (server must be running)See bench/README.md for full methodology and all test categories.
spec-forge runs on the iii engine. Install it first:
curl -fsSL https://install.iii.dev/iii/main/install.sh | shThis installs the iii CLI to ~/.local/bin/iii. Make sure ~/.local/bin is in your PATH.
Verify the installation:
iii --versionspec-forge is a Rust worker. Install Rust if you don't have it:
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh# 1. Clone and enter
git clone https://github.com/iii-hq/spec-forge.git
cd spec-forge
# 2. Set your Anthropic API key (in .env or environment)
echo "ANTHROPIC_API_KEY=sk-ant-..." > .env
# 3. Start iii engine
iii --config iii-config.yaml &
# 4. Build and run the worker
cargo build --release
./target/release/spec-forge &
# 5. Serve demo UI
cd demo && python3 -m http.server 3112 &Open http://localhost:3112 for the interactive playground.
All endpoints are iii functions with HTTP triggers, served by the engine on port 3111:
| Route | Method | Description |
|---|---|---|
/spec-forge/generate |
POST | Generate spec (cache -> semantic -> Claude -> validate) |
/spec-forge/stream |
POST | Stream patches via iii Channel (WebSocket) — real-time progressive rendering |
/spec-forge/refine |
POST | Patch existing spec with incremental changes |
/spec-forge/validate |
POST | Validate spec against component catalog |
/spec-forge/prompt |
POST | Preview the LLM prompt that would be sent |
/spec-forge/stats |
GET | Rate limiter, cache, and stream metrics |
/spec-forge/health |
GET | Liveness check |
Claude outputs one JSON patch operation per line. Each line is independently parseable, enabling real-time streaming:
{"op":"add","path":"/root","value":"main"}
{"op":"add","path":"/elements/main","value":{"type":"Card","props":{"title":"Dashboard"},"children":["metric-1","chart"]}}
{"op":"add","path":"/elements/metric-1","value":{"type":"Metric","props":{"label":"Revenue","value":"$42K"},"children":[]}}
{"op":"add","path":"/elements/chart","value":{"type":"Chart","props":{"title":"Sales"},"children":[]}}Operations: add, replace, remove on paths /root and /elements/{key}.
The /stream endpoint creates an iii Channel (WebSocket-backed pipe) and returns reader credentials:
curl -X POST http://localhost:3111/spec-forge/stream \
-H "Content-Type: application/json" \
-d '{"prompt": "A sales dashboard", "catalog": {...}}'Response:
{
"cached": false,
"channel": {
"channel_id": "ch_abc123",
"access_key": "key_xyz"
}
}Connect to the WebSocket to receive patches in real-time:
ws://localhost:49134/ws/channels/{channel_id}?key={access_key}&dir=read
Messages arrive as:
{"type": "patch", "patch": {"op": "add", "path": "/elements/metric-1", "value": {...}}}
{"type": "done", "spec": {...}, "valid": true, "generation_ms": 1823}Each patch message triggers a progressive UI render. The browser never waits for the full LLM response.
curl -X POST http://localhost:3111/spec-forge/generate \
-H "Content-Type: application/json" \
-d '{
"prompt": "A login form with email and password",
"catalog": {
"components": {
"Stack": {"description": "Flex container", "props": {"direction": "vertical|horizontal", "gap": "number"}, "children": true},
"Card": {"description": "Container card", "props": {"title": "string"}, "children": true},
"Input": {"description": "Text input", "props": {"placeholder": "string", "type": "string", "label": "string"}},
"Button": {"description": "Button", "props": {"label": "string", "variant": "primary|secondary"}}
},
"actions": {"submit": {"description": "Submit form"}}
}
}'Response includes both the final spec and the JSONL patches used to build it:
{
"spec": {
"root": "form-card",
"elements": {
"form-card": {"type": "Card", "props": {"title": "Login"}, "children": ["form-stack"]},
"email-input": {"type": "Input", "props": {"placeholder": "you@example.com", "type": "email", "label": "Email"}, "children": []}
}
},
"patches": [
{"op": "add", "path": "/root", "value": "form-card"},
{"op": "add", "path": "/elements/form-card", "value": {"type": "Card", "props": {"title": "Login"}, "children": ["form-stack"]}}
],
"cached": false,
"generation_ms": 2841,
"model": "claude-sonnet-4-6"
}Second request with same or similar prompt: "cached": true, "generation_ms": 0.
Send a change request with JSONL patches instead of regenerating from scratch:
curl -X POST http://localhost:3111/spec-forge/refine \
-H "Content-Type: application/json" \
-d '{
"prompt": "Add a forgot password link",
"current_spec": {"root": "form-card", "elements": {...}},
"catalog": {...}
}'The demo/index.html playground connects to the iii engine and provides:
- JSON tab — syntax-highlighted spec output with JSONL patches
- STREAM tab — real-time log of WebSocket patch messages
- LIVE RENDER — progressive rendering as patches arrive via WebSocket
- STATIC CODE — copy-pasteable React code using
@anthropic-ai/json-render-react - Catalog drawer — 4 presets (dashboard, form, ecommerce, minimal) with editable JSON
- Refine — iteratively modify existing specs without full regeneration
| Preset | Components |
|---|---|
| Dashboard | Stack, Card, Grid, Heading, Metric, Table, Button, Text, Badge, Divider, Input |
| Form | Stack, Card, Heading, Input, Textarea, Select, Checkbox, Radio, Button, Text, Divider, Badge |
| Ecommerce | Stack, Grid, Card, Heading, Image, Text, Button, Metric, Badge, Divider, List |
| Minimal | Stack, Card, Heading, Text, Button |
spec-forge is a pure iii-sdk worker — no Axum, no standalone HTTP server. All endpoints are registered as iii functions with HTTP triggers.
iii-engine (iii-config.yaml)
├── RestApiModule (port 3111, CORS)
├── StateModule (file-based KV)
├── OtelModule (traces, metrics, logs)
├── PubSubModule (local)
├── CronModule
└── Channels (WebSocket port 49134)
spec-forge worker (connects via WebSocket)
├── register_functions() 7 iii functions
├── register_http_triggers() 7 HTTP trigger bindings
└── business logic
├── generate_core() cache → semantic → rate limit → Claude → validate → store
├── stream_core() iii Channel → Claude streaming → JSONL patches → WebSocket
├── refine_core() JSONL patch-based refinement (Add/Replace/Remove)
├── validate_core() spec validation against catalog
├── prompt_core() preview LLM prompt
├── stats_core() metrics from iii Streams
└── health_core() liveness check
src/
├── main.rs # iii worker: SharedState, 7 functions, 7 HTTP triggers, core logic
├── types.rs # GenerateRequest, Catalog, ComponentDef, ActionDef, UISpec, UIElement
├── cache.rs # SHA-256 exact cache with TTL (DashMap)
├── semantic.rs # TF-IDF cosine similarity cache for fuzzy prompt matching
├── limiter.rs # Rate limiter (token bucket + concurrency semaphore)
├── validate.rs # Spec validation against component catalog
├── prompt.rs # LLM prompt builder (JSONL/RFC 6902 format)
└── bench.rs # Benchmark binary
demo/
├── index.html # Self-contained playground with WebSocket streaming
└── demo.gif # Demo recording
client/
├── src/index.ts # JS/TS client SDK
└── src/json-render-adapter.ts # Adapter for json-render <Render>
| Primitive | Usage |
|---|---|
III::init() |
Initialize worker, connect to engine via WebSocket |
register_function_with_description() |
Register 7 named functions with descriptions |
register_trigger("http", ...) |
Bind functions to HTTP routes (api_path + http_method) |
ApiRequest<T> / ApiResponse<T> |
Typed HTTP request/response wrappers |
create_channel() + ChannelWriter |
WebSocket-backed streaming pipe for real-time patches |
Streams |
Atomic metric counters (cache hits/misses, generation count/ms) |
get_context().logger |
Structured logging (info, warn, error) |
| OtelModule | Automatic OpenTelemetry traces, metrics, and logs |
cargo test27 tests: cache (4), semantic (7), limiter (5), validate (6), prompt (5).
| Env Var | Default | Description |
|---|---|---|
ANTHROPIC_API_KEY |
required | Claude API key |
DOTENV_PATH |
.env |
Path to .env file |
| Module | Config |
|---|---|
| RestApiModule | Port 3111, CORS for localhost:3112/3111/3000/5173 |
| StateModule | File-based KV at ./data/state_store.db |
| OtelModule | Memory exporter, all signals enabled |
| PubSubModule | Local adapter |
| CronModule | KV-backed cron |
Rate limit: 60 req/min + 5 concurrent. Cache TTL: 300s. Semantic threshold: 0.85. Default model: claude-sonnet-4-6.
MIT
