Conversation

@pull pull bot commented Dec 5, 2025

See Commits and Changes for more details.


Created by pull[bot] (v2.0.0-alpha.4)


willkelly and others added 3 commits December 5, 2025 12:28
* perf(memory): native crypto hashing + recursive object interning

Optimize merkle hashing performance through two complementary approaches:

## 1. Native Crypto Hashing
Switch from @noble/hashes (pure JS) to node:crypto (native OpenSSL with
SHA-NI hardware acceleration) for merkle reference computation.

- Use merkle-reference's Tree.createBuilder() API with custom hash fn
- Leverage built-in WeakMap caching for sub-object reuse

## 2. Recursive Object Interning
Add intern() function that deduplicates objects by JSON content, enabling
merkle-reference's WeakMap cache to hit on shared nested content.

- Integrated into Fact.assert() and Fact.unclaimed() automatically
- Uses WeakRef + FinalizationRegistry for automatic garbage collection
- Recursively interns nested objects for maximum cache hits

## Performance Improvements (16KB payloads)
- Shared content across facts: ~2.5x faster (286µs → 71µs)
- Repeated {the, of} patterns: ~62x faster (25µs → 0.4µs)
- Overall set fact: ~786µs, get fact: ~58µs, retract: ~394µs

## Files Changed
- reference.ts: Add native crypto + intern() function
- fact.ts: Integrate interning into assert/unclaimed/normalizeFact
- HASHING.md: Document optimization journey and benchmarks
- test/memory_bench.ts: Add comprehensive interning benchmarks
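
The native-hash swap can be sketched as follows. The `node:crypto` call is standard; the `(Uint8Array) => Uint8Array` callback shape is an assumed adapter for merkle-reference's `Tree.createBuilder()` hash option, not the PR's exact code:

```typescript
import { createHash } from "node:crypto";

// Native SHA-256 via OpenSSL (with SHA-NI hardware acceleration where
// available), replacing the pure-JS @noble/hashes implementation.
const sha256 = (payload: Uint8Array): Uint8Array =>
  new Uint8Array(createHash("sha256").update(payload).digest());
```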

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

* fix(memory): update bench task to use correct benchmark file

The benchmark file was renamed from benchmark.ts to memory_bench.ts
but the deno.json task wasn't updated to match.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

* perf(memory): replace WeakRef with strong LRU cache in intern()

The previous WeakRef-based intern cache had a fundamental flaw: GC would
collect interned objects between refer() calls when no strong reference
held them. This prevented merkle-reference's WeakMap from getting cache
hits on repeated identical content.

Changes:
- Replace WeakRef with direct object storage in internCache Map
- Add LRU eviction at 10,000 entries to bound memory
- Add WeakSet for O(1) early return on already-interned objects
- Remove FinalizationRegistry (no longer needed)

The strong reference approach ensures interned objects stay alive long
enough for refer() to benefit from merkle-reference's identity-based
WeakMap cache. Benchmarks show ~2.5x speedup on shared content.
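
A minimal sketch of the strong-reference cache described above. Names (`internCache`, `MAX_ENTRIES`) are illustrative, and recursive interning of nested objects is omitted for brevity:

```typescript
const MAX_ENTRIES = 10_000;
const internCache = new Map<string, object>(); // strong refs, LRU-bounded
const interned = new WeakSet<object>(); // O(1) early return

function intern<T extends object>(value: T): T {
  if (interned.has(value)) return value; // already canonical
  const key = JSON.stringify(value); // dedupe by JSON content
  const hit = internCache.get(key);
  if (hit !== undefined) {
    // Refresh recency: Map preserves insertion order, so delete +
    // re-insert moves the entry to the end (the PR-review fix).
    internCache.delete(key);
    internCache.set(key, hit);
    return hit as T;
  }
  if (internCache.size >= MAX_ENTRIES) {
    // Evict the least-recently-used entry (first in insertion order).
    const oldest = internCache.keys().next().value as string;
    internCache.delete(oldest);
  }
  internCache.set(key, value);
  interned.add(value);
  return value;
}
```

Because `intern()` always hands back the first-seen object for a given JSON content, later `refer()` calls see the same identity and merkle-reference's WeakMap cache can hit.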

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

* perf(memory): add unclaimedRef() to cache unclaimed references

Add a Map-based cache for unclaimed fact references. The common pattern
of refer(unclaimed({the, of})) was being recomputed on every call,
costing ~29µs each time.

Changes:
- Add unclaimedRefCache Map keyed by "${the}|${of}"
- Add unclaimedRef() function that caches the full Reference
- Update assert() and normalizeFact() to use unclaimedRef()

Cache hits return in ~0.4µs vs ~29µs for a fresh refer() call,
providing ~62x speedup for repeated {the, of} patterns.
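
The cache can be sketched as below; `refer()` and `unclaimed()` are stand-ins for the real memory-package functions (here a placeholder "hash" for illustration), while the `"${the}|${of}"` key matches the commit:

```typescript
// Hypothetical stand-ins for the real unclaimed()/refer():
const unclaimed = ({ the, of }: { the: string; of: string }) =>
  ({ the, of, cause: null });
const refer = (value: unknown): string => JSON.stringify(value); // placeholder

const unclaimedRefCache = new Map<string, string>();

// Memoize the Reference for an unclaimed {the, of} pair; repeat
// lookups skip the expensive refer() call entirely.
function unclaimedRef(the: string, of: string): string {
  const key = `${the}|${of}`;
  let ref = unclaimedRefCache.get(key);
  if (ref === undefined) {
    ref = refer(unclaimed({ the, of }));
    unclaimedRefCache.set(key, ref);
  }
  return ref;
}
```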

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

* perf(memory): optimize refer() ordering and use cached references

Multiple optimizations to reduce refer() overhead in the write path:

1. Use unclaimedRef() instead of refer(unclaimed(...)) in:
   - recall() for cause field
   - getFact() for cause field
   - toFact() for cause field
   - swap() for base reference
   - commit() for initial cause

2. Reorder refer() calls in swap() to maximize cache hits:
   - Compute fact hash BEFORE importing datum
   - When refer(assertion) traverses, it caches the payload hash
   - The subsequent refer(datum) in importDatum() hits cache (~300ns)
   - Previously: datum first (missed cache opportunity)
   - This saves ~25% on refer() time (~50-100µs per operation)

3. Intern transaction before creating commit:
   - Ensures all nested objects share identity
   - refer(assertion) caches sub-object hashes
   - refer(commit) hits those caches (~26% faster commits)

Overall setFact improvement: ~700µs → ~664µs (5-10% faster)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

* docs(memory): comprehensive hashing performance documentation

Expanded HASHING.md with detailed findings from performance investigation:

- Executive summary: 90% of refer() time is structural overhead, not hashing
- How merkle-reference works internally (toTree → digest → fold)
- Time breakdown showing where ~190µs actually goes for 16KB payload
- Why nested transaction schema (4 levels) is expensive (~77-133µs overhead)
- setFact breakdown: ~664µs total, 71% in refer() calls
- Key findings: native crypto, WeakMap caching, call order, intern benefits
- What didn't work: small object cache patterns
- Current implementation with code examples
- Optimization opportunities ranked by impact (immediate → breaking)
- Realistic expectations table with potential improvements
- Architecture notes on why content-addressing requires this overhead

This document serves as a reference for future optimization work and
explains why we're approaching the fundamental floor for content-addressed
storage.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

* fix(memory): use conditional crypto for browser compatibility

- Browser: use merkle-reference's default refer() (uses @noble/hashes internally)
- Server: upgrade to node:crypto TreeBuilder for ~1.5-2x speedup
- Dynamic import prevents bundlers from resolving node:crypto in browser builds
- Fixes CORS/module resolution error when shell tries to import node:crypto
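
The environment gate can be sketched as follows. The inline runtime check is illustrative (the PR uses `isDeno()` from `@commontools/utils/env`), and returning `null` means "keep merkle-reference's default refer()":

```typescript
type Sha256 = (data: Uint8Array) => Uint8Array;

// null => fall back to merkle-reference's default (pure-JS) hashing,
// which works in any environment including the browser.
async function nativeSha256(): Promise<Sha256 | null> {
  const onServer =
    typeof (globalThis as { Deno?: unknown }).Deno !== "undefined" ||
    typeof (globalThis as { process?: unknown }).process !== "undefined";
  if (!onServer) return null;
  // import() resolves at runtime, so browser bundlers never see a
  // static "node:crypto" specifier they would try (and fail) to resolve.
  const { createHash } = await import("node:crypto");
  return (data) =>
    new Uint8Array(createHash("sha256").update(data).digest());
}
```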

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

* perf(memory): add SQLite pragmas for WAL mode and performance

- journal_mode=WAL: Better concurrency, faster writes
- synchronous=NORMAL: Safe for WAL, improved write performance
- busy_timeout=5000: Wait on locks instead of failing
- page_size=32768: 32KB pages for new databases
- cache_size=-64000: ~64MB in-memory page cache
- temp_store=MEMORY: Keep temp tables in RAM
- mmap_size=268435456: 256MB memory-mapped I/O
- foreign_keys=ON: Enforce referential integrity

Benchmarks show ~3x faster single writes/updates and ~1.5x faster reads
on file-based databases.
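
The pragma set above, in applied order, can be expressed as plain statements; the `pragmaStatements` helper is illustrative, not the actual code, and the final `db.exec` line assumes whatever SQLite driver the project uses:

```typescript
// Pragma values from this commit, in applied order.
const PRAGMAS: Record<string, string | number> = {
  journal_mode: "WAL", // better concurrency, faster writes
  synchronous: "NORMAL", // safe under WAL
  busy_timeout: 5000, // wait on locks instead of failing
  page_size: 32768, // 32KB pages (new databases only)
  cache_size: -64000, // negative => KiB, so ~64MB page cache
  temp_store: "MEMORY", // temp tables in RAM
  mmap_size: 268435456, // 256MB memory-mapped I/O
  foreign_keys: "ON", // enforce referential integrity
};

const pragmaStatements = (
  pragmas: Record<string, string | number>,
): string[] =>
  Object.entries(pragmas).map(([name, value]) => `PRAGMA ${name}=${value};`);

// With your SQLite driver:
// for (const stmt of pragmaStatements(PRAGMAS)) db.exec(stmt);
```

Note that `page_size` only takes effect on newly created databases (or after a `VACUUM`), which is why the commit calls it out "for new databases".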

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

* test(memory): add file-based benchmarks for pragma testing

Add file-based benchmark group to measure real WAL mode and pragma
impact on disk I/O. Memory-based benchmarks don't exercise WAL/mmap.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

* fix(memory): address PR review comments

- Fix LRU cache recency bug in intern(): now properly moves accessed
  entries to end of Map via delete+re-insert
- Replace custom isBrowser detection with isDeno() from @commontools/utils/env
- Fix type error in unclaimedRef by using Ref.View<Unclaimed> to match
  what refer() returns

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

* fix(memory): update benchmark to use unclaimedRef

The benchmark was testing the old refer() pattern directly, but the PR
changed the caching strategy to use unclaimedRef() for unclaimed facts.
Update the benchmark to test the actual production code path.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

* fix(deps): restore deno.lock from main to fix OpenTelemetry types

The lock file had inadvertently downgraded @opentelemetry/sdk-trace-base
from 1.30.1 to 1.19.0, which doesn't support the spanProcessors
constructor option used in otel.ts.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

* fix(memory): remove unused import and fix formatting

- Remove unused `unclaimed` import from space.ts
- Apply deno fmt to HASHING.md and space.ts

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

---------

Co-authored-by: Claude <noreply@anthropic.com>
…t pattern, syncing state (#2188)

chore: shell: Separate the derivation of active pattern and space root pattern, synchronizing the UI state

Rewrite some pattern integration tests to be less dependent on timing
issues surfaced by this change.

instantiate-recipe.test.ts has been disabled for now.

Revert "chore: shell: Separate the derivation of active pattern and space root pattern, syncing state (#2188)"

This reverts commit 6e979d0.
@pull pull bot locked and limited conversation to collaborators Dec 5, 2025
@pull pull bot added the ⤵️ pull label Dec 5, 2025
@pull pull bot merged commit 5f7f106 into ExaDev:main Dec 5, 2025