[STILL IN DRAFT] feat: add /kv-cache route with interactive KV cache explainer #77
Open
sengopal wants to merge 10 commits into poloclub:main from
Conversation
Design for /kv-cache route implementing interactive KV cache visualization per issue poloclub#63. Covers routing, components, data model, and state management.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Step-by-step plan for /kv-cache route: data generation script, store, KVCacheTable component, AttentionMatrix decode mode, and kv-cache SvelteKit route.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…amples

Offline Python script (scripts/generate_kv_examples.py) that runs GPT-2 on 5 prompts and extracts KV cache snapshots + attention scores per decode step. Outputs 5 JS modules to src/constants/examples/kv/.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
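The per-decode-step records this script emits might look roughly like the sketch below. The field names and nesting here are illustrative assumptions, not the script's actual output schema; the constants (12 layers, 12 heads, 64-dim heads) are GPT-2 small's real dimensions.

```python
def snapshot(step, tokens, num_layers=12, num_heads=12, head_dim=64):
    """One hypothetical decode-step record: K/V shapes per layer plus
    the length of the newest token's 1 x N attention row."""
    seq_len = len(tokens)
    return {
        "step": step,
        "tokens": tokens,
        # one K and one V block per layer, each (num_heads, seq_len, head_dim)
        "kv": [
            {
                "layer": layer,
                "k_shape": (num_heads, seq_len, head_dim),
                "v_shape": (num_heads, seq_len, head_dim),
            }
            for layer in range(num_layers)
        ],
        # the new token attends over every cached position
        "attention_row_len": seq_len,
    }

snap = snapshot(step=2, tokens=["The", "cat", "sat", "on", "a"])
```

The key property the visualization relies on is that the `seq_len` axis of every K/V block grows by one per decode step while all other dimensions stay fixed.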
New route at /kv-cache that visualizes the KV cache mechanism in GPT-2.

- Prefill phase: shows the existing N×N attention matrix (unchanged flow)
- Decode phase: step-through controls (← Prev / Next →) reveal a growing KV cache table (K=red, V=green via VectorCanvas) per token, plus a 1×N attention strip for the current decode token
- Decode data driven by pre-computed examples (scripts/generate_kv_examples.py) with 5 prompts × 5 decode steps each; logits stripped to keep files ~1.3 MB
- New store (src/store/kvcache.ts): decodeStep, kvCache, isDecoding, currentDecodeData, promptTokenCount
- AttentionMatrix.svelte: when isDecoding=true, shows the 1×N strip + KVCacheTable instead of the N×N matrix; root page unaffected (isDecoding defaults to false)
- vite upgraded from 5 to 6 to match the @sveltejs/vite-plugin-svelte@6 peer dep; vite.config.ts SCSS additionalData updated to use an absolute path for Vite 6
- Build verified: vite build succeeds, build/kv-cache.html generated

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
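The decode-phase mechanics described above (a cache that grows by one K/V pair per token, and a 1×N attention strip scoring the newest token against every cached key) can be sketched as a toy in plain Python. This is an illustrative model, not the component's code:

```python
import math

def attention_row(q, cached_keys):
    """Score the new token's query against every cached key,
    producing the 1 x N softmax strip shown during decode."""
    d_k = len(q)
    scores = [
        sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k)
        for k in cached_keys
    ]
    m = max(scores)                       # subtract max for stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Each decode step appends one key (and value) to the cache,
# then the new token attends over the whole cache.
cache = [[1.0, 0.0], [0.0, 1.0]]          # keys cached during prefill
cache.append([1.0, 1.0])                  # new token's key joins the cache
row = attention_row([1.0, 1.0], cache)    # 1 x 3 strip for the new token
```

Note the strip length grows with the cache: step N over an M-token prompt yields a 1×(M+N) row, which is why the table and strip widen together in the UI.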
- Attention strip (Issue 1): normalize the color scale to [0, max_score] so circles are visible even with low-entropy, near-uniform distributions
- QKV/MLP show all tokens (Issues 2 & 3): set $tokens to [inputToken] during decode so the Embedding/QKV/MLP columns show only the new token
- LinearSoftmax not updating (Issue 4): set predictedToken to the next decode step's inputToken so the prediction panel updates each step
- Restore the full prompt tokens when navigating back to prefill (step 0)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
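A minimal sketch of the first fix, assuming circle opacity is simply score divided by the row maximum (the function name is hypothetical). Scaling by the row max rather than an absolute scale keeps near-uniform rows visible:

```python
def strip_opacity(scores):
    """Map attention scores to [0, 1] by dividing by the row max,
    so low-entropy, near-uniform rows still render visible circles."""
    m = max(scores)
    if m == 0:
        return [0.0 for _ in scores]
    return [s / m for s in scores]

# A near-uniform softmax row: absolute values are all ~0.25,
# which would render nearly invisible on a [0, 1] scale.
row = strip_opacity([0.24, 0.26, 0.25, 0.25])
```

With this normalization the brightest circle in every row is always at full opacity, so relative differences stay readable regardless of the distribution's entropy.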
- Fix contenteditable not updating from the store: add afterUpdate in InputForm to sync inputRef.innerText when not focused (safe for the root page)
- Add topLogits (top-50 [tokenId, logit] pairs) per decode step in the Python script and regenerate examples; decode steps are now ~1.35 MB each (vs ~1.3 MB before)
- Update the probabilities panel per decode step using reconstructed sparse logits with the user's temperature/sampling applied; the greedy decode token is highlighted
- Fix temperature/sampling subscribers in decode mode to re-run the distribution from the current step's topLogits instead of stale prefill logits
- Accumulate decoded tokens in the text box (prefillText + decoded so far) with an ignoreInputTextChange guard to prevent re-triggering prefill
- Hide MLP token labels during decode mode via a .decode-mode CSS class
- Override predictedToken after prefill to show the greedy first decode token

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
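Rebuilding the probabilities panel from sparse top-k logits can be sketched as below. The function name is an assumption; the [tokenId, logit] pair format and temperature application follow the commit's description, and the token IDs are made up for illustration:

```python
import math

def probs_from_top_logits(top_logits, temperature=1.0):
    """Rebuild a sampling distribution from sparse [tokenId, logit]
    pairs, applying the user's temperature before softmax."""
    scaled = {tid: logit / temperature for tid, logit in top_logits}
    m = max(scaled.values())              # subtract max for stability
    exps = {tid: math.exp(v - m) for tid, v in scaled.items()}
    total = sum(exps.values())
    return {tid: e / total for tid, e in exps.items()}

# Hypothetical top-3 slice of a step's topLogits.
dist = probs_from_top_logits([(464, 5.1), (1169, 4.2), (257, 2.0)],
                             temperature=0.7)
greedy = max(dist, key=dist.get)          # the highlighted greedy token
```

Because only the top-50 logits survive, the renormalized probabilities are exact within that set but slightly overstate each token's true probability mass; for a display panel that trade-off keeps file sizes small.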
- KVCacheTable: transpose from rows to columns (tokens as columns with rotated labels, K/V row headers on the left); eliminates vertical scroll
- AttentionMatrix: add click-to-expand for decode mode using a separate decodeExpanded state (independent of prefill's isAttentionExpanded/expandAttention, to avoid animating missing DOM elements)
- Expanded decode view shows the Softmax 1×N circles + full KV cache table; outside-click closes via the decodeExpandableEl bound ref
- Attention.svelte: elevate the .head-title z-index when expanded so head nav buttons are clickable above the dim overlay
- Fix empty attentionOutputs in decode: use attn_implementation='eager' in the generate script; the default implementation silently returns an empty tuple alongside past_key_values; regenerate all 5 examples (~1.5 MB each)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Derive the dot-product and scaling·mask stages from softmax via log-space inversion (no data regeneration needed)
- Expanded decode view shows Dot Product → Scaling·Mask → Softmax panels horizontally (nowrap, overflow visible), mirroring the prefill layout
- "Out" button opens an inline modal with the real attention×value=out computation, using kvCache values for the current head
- min-height on the decode container matches the prefill headContentHeight

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
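The log-space inversion works because softmax is invariant to adding a constant to every input: taking the log of the softmax outputs recovers the scaled, masked scores up to a shared additive constant, and multiplying by √d_k then approximates the raw dot products (up to that same constant). A minimal numeric sketch, assuming d_k = 64 as in GPT-2; the function names are illustrative:

```python
import math

def softmax(xs):
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    t = sum(es)
    return [e / t for e in es]

def invert_softmax(probs, d_k=64):
    """Recover scaled scores (up to a shared additive constant) via log,
    then undo the 1/sqrt(d_k) scaling to approximate raw dot products."""
    scaled = [math.log(p) for p in probs]
    dots = [s * math.sqrt(d_k) for s in scaled]
    return scaled, dots

p = softmax([2.0, 1.0, 0.5])      # what the stored data provides
scaled, dots = invert_softmax(p)  # what the panels display
```

The recovered values differ from the true scores by a constant offset, but since the panels visualize relative magnitudes within a row, that offset is harmless, which is why no data regeneration was needed.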
…label, add animations

- Normalize value vectors globally (shared min/max) so token differences are visible; normalize the Out vector with its own min/max range
- Fix the black strip in the Out vector by clamping height to 64px (the data length)
- Increase value strip height from 12px to 32px for more visible dimension patterns
- Restyle the Out trigger to match the prefill column label (purple, vertical, tooltip)
- Add fly/fade transitions on expand/collapse and Out modal open/close

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Add 6 inference-oriented textbook pages to the /kv-cache route covering What is Inference, Prefill Phase, The KV Cache, Decode Phase, Attention During Decode, and Autoregressive Loop. Each page includes an illustrative image and proper citation from Lages (Medium), Verma & Vaidya (NVIDIA), and Not Lain (Hugging Face). Wire up Textbook component on kv-cache page, set initial page to kv-inference synchronously, and move <Textbook> outside .main-section so the floating button is not hidden by the opacity fade-in. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Summary
Closes #63
Adds a new /kv-cache route (poloclub.github.io/transformer-explainer/kv-cache) that visualizes the KV cache mechanism interactively, leaving the original root page completely untouched.

Implementation

- src/routes/kv-cache/+page.svelte
- src/routes/kv-cache/+page.ts: export const prerender = true
- src/store/kvcache.ts: decodeStep, kvCache, isDecoding, currentDecodeData, promptTokenCount
- src/components/KVCacheTable.svelte
- src/components/AttentionMatrix.svelte: when isDecoding=true, shows the 1×N strip + KVCacheTable; root page unaffected
- src/constants/examples/kv/ex{0-4}.js: pre-computed decode examples
- scripts/generate_kv_examples.py: offline generation script
- svelte.config.js: add /kv-cache to prerender entries
- vite.config.ts: additionalData absolute path for Vite 6 compatibility
- package.json: vite 5 → 6 to match the @sveltejs/vite-plugin-svelte@6 peer dep

Test plan
- /kv-cache: prefill animation plays for the default example
- Root page (/): behavior unchanged; no KV cache UI shown

🤖 Generated with Claude Code