Skip to content

Refactor bundled transforms and add search surfaces#90

Merged
marklubin merged 6 commits intomainfrom
mark/search-surface-transform-boundaries
Mar 6, 2026
Merged

Refactor bundled transforms and add search surfaces#90
marklubin merged 6 commits intomainfrom
mark/search-surface-transform-boundaries

Conversation

@marklubin
Copy link
Owner

Summary

  • move the generic synthesis primitives to synix.transforms and keep the bundled memory-oriented transforms in synix.ext, with compatibility aliases during migration
  • introduce SearchSurface plus uses=[...] for build-time retrieval, and surface-aware planner/runner/CLI handling
  • update docs, templates, tests, and demo goldens to the new transform/search model

Verification

  • git diff --check
  • uv run python -c "from synix.dev import verify_demos; verify_demos()"
  • uv run pytest

Results

  • demo smoke: All 5 demos passed.
  • test suite: 1472 passed in 122.85s

@github-actions
Copy link
Contributor

github-actions bot commented Mar 6, 2026

Note

Red Team Review — OpenAI GPT-5.2 | Adversarial review (docs + diff only)

Threat assessment — Medium risk: this PR introduces new public concepts (SearchSurface, uses) and changes module boundaries; it’s directionally good but it contains a couple of correctness/ergonomics traps that will be painful to unwind.

One-way doors

  1. SearchSurface as a public API + Pipeline.surfaces registry

    • Why hard to reverse: users will write pipelines with SearchSurface(...) and expect stable semantics (materialization timing, local path, caching, search modes). You’ll be stuck supporting it or writing migrations.
    • Safe if: semantics are nailed down (is it a capability, a build artifact, or a projection?) and you have a compatibility story for existing SearchIndex pipelines + stable CLI behavior.
  2. Transform.uses: list[Layer] + fingerprinting it

    • Why hard to reverse: you’ve made uses part of cache keys (components["uses"]). Any later change to what qualifies as “use” or how it’s ordered breaks caching determinism and causes rebuild storms.
    • Safe if: uses is restricted to capability layers (e.g., SearchSurface only), validated at pipeline load time, and fingerprint includes enough identity to avoid accidental cache hits (see findings).
  3. Filesystem contract: build/surfaces/<name>.db

    • Why hard to reverse: tools/tests/templates will come to rely on this path. You’re codifying a layout while simultaneously saying it’s “capability”.
    • Safe if: it’s explicitly documented as compatibility local realization, not canonical; and you can change it behind a stable API (context handle) later.
  4. Module boundary reshuffle: generic transforms in synix.transforms, memory transforms in synix.ext

    • Why hard to reverse: import paths become part of user code and docs; you already added shims and re-exports which signals “we’ll support both”.
    • Safe if: you publish an explicit deprecation timeline and ensure both import paths work for at least one minor line.

Findings

  1. src/synix/build/runner.py: progressive surface materialization is logically wrong

    • Pattern: _materialize_layer_search_surfaces() calls _materialize_projection(... source_layer_override=available_names) whenever any source layer becomes available.
    • Failure mode: you are building a “partial” surface (only some source layers) into the final db path build/surfaces/<name>.db, but _refresh_surface_cache() later records the cache metadata for the full closure. Result: cache claims surface is fresh even if the DB contains a partial index depending on build order and failures. This is a correctness bug, not just a perf issue.
    • Severity: [critical]
  2. src/synix/build/plan.py vs runner.py: inconsistent planning semantics

    • Pattern: planner always plans surfaces over surface.sources (full set). runner may progressively materialize partial surfaces.
    • Failure mode: synix plan will report “cached/new” based on full closure hashes while runtime may use a partial surface during the same build for TopicalRollup. Non-obvious mismatch and hard-to-debug behavior.
    • Severity: [warning]
  3. src/synix/build/llm_transforms.py: _open_search_surface_index() relies on internal/private API + swallows all exceptions

    • Pattern: idx._get_conn() + broad except Exception: return None.
    • Failure mode: any schema mismatch or sqlite error silently disables retrieval and causes TopicalRollup to fall back to “use all episodes”, changing outputs without warning. This will look like “semantic search works sometimes”.
    • Severity: [warning]
  4. src/synix/core/models.py: SearchIndex inherits SearchSurface but Pipeline.add() classification is brittle

    • Pattern: if isinstance(layer, (SearchIndex, FlatFile)) ... elif isinstance(layer, SearchSurface) ...
    • Failure mode: currently OK, but this is a footgun for future subclasses (anything inheriting SearchSurface but intended as a projection gets misfiled unless added to the first tuple). You’re encoding taxonomy by isinstance checks instead of explicit type/category.
    • Severity: [minor]
  5. Fingerprinting of uses is underspecified

    • Pattern: components["uses"] = fingerprint_value(sorted(layer.name for layer in self.uses))
    • Failure mode: if a surface’s modes/sources/embedding config changes but name stays same, transforms may not rebuild even though their retrieval behavior changes. That’s a cache correctness hole.
    • Severity: [warning]

Missing

  • Tests that catch the partial-surface cache lie: build a surface sourced from two layers, interrupt after first layer, verify cache metadata doesn’t mark it valid; verify downstream transform doesn’t use partial surface without explicit opt-in.
  • Validation: enforce Transform.uses only contains SearchSurface (or capability types), and reject uses cycles/unknown layers.
  • Documentation: explicit contract for surface materialization timing (built once when all sources complete vs progressive) and whether partial is allowed.
  • CLI behavior: synix clean implications for build/surfaces/* and whether surfaces are rebuilt automatically.

Verdict

Block. The progressive materialization + final-cache refresh is a correctness bug that will produce silently wrong retrieval behavior; fix the surface lifecycle semantics (build only when complete, or version partials separately) and tighten fingerprinting/validation before merging.

Review parameters
  • Model: gpt-5.2
  • Context: README.md, DESIGN.md, synix.dev, PR diff
  • Diff size: 3,881 lines
  • Prompt: .github/prompts/openai_review.md
  • Timestamp: 2026-03-06T20:50:34Z

@github-actions
Copy link
Contributor

github-actions bot commented Mar 6, 2026

Note

Architectural Review — Claude Opus | Blind review (docs + diff only)

Summary

This PR introduces SearchSurface as a first-class build-time capability declaration, reorganizes the module hierarchy so generic transforms (MapSynthesis, etc.) are canonical in synix.transforms while domain-specific memory transforms (EpisodeSummary, etc.) move to synix.ext, and adds a detailed RFC document (search-surface-rfc.md) laying out the full design vision for search surfaces, release targets, and transform context evolution.

Alignment

Strong alignment with DESIGN.md principles. The uses=[...] / depends_on split cleanly separates artifact flow from capability access — this respects the DAG-as-provenance model without polluting it with mutable side-effect dependencies. The RFC explicitly rejects "projection nodes as live DAG inputs" (Option B), which would have violated the immutable-artifact-DAG invariant. The SearchSurface fingerprint is included in compute_fingerprint via the uses component, preserving the "cache keys capture all inputs" principle. The module reorganization (generic patterns as platform, domain-specific as ext) advances Hypothesis 1 ("no one-size-fits-all") by making it clearer that memory-oriented transforms are just one composition of the generic primitives.

Observations

  1. [concern] Duplicated helper functions. _surface_local_path and _search_surface_handles are copy-pasted identically in both plan.py and runner.py. These should be extracted to a shared location to avoid drift.

  2. [concern] SearchIndex now inherits from SearchSurface but pipeline.add() routes SearchIndex to self.projections while SearchSurface goes to self.surfaces. The isinstance check order in add() is correct (SearchIndex checked first), but this inheritance relationship is fragile — a future refactor could easily break the routing. The isinstance checks scattered across runner.py and plan.py (isinstance(proj, (SearchIndex, SearchSurface))) multiply this risk.

  3. [concern] _open_search_surface_index reaches into SQLite internals (sqlite_master query, _get_conn() private method). This contradicts the RFC's own stated goal of keeping search handles backend-agnostic. The first implementation leaks the abstraction it's trying to establish.

  4. [question] Progressive materialization ordering. _materialize_layer_search_surfaces runs after each layer, but what guarantees the surface's source layers are built in the right order? If a surface sources from layers at different DAG levels, partial materialization happens with incomplete data. The final _refresh_surface_cache at the end seems to handle the cache correctly, but the intermediate progressive materializations could produce incomplete search indexes that a uses-consuming transform queries during the build.

  5. [positive] Test coverage is thorough. New unit tests for DAG registration, plan caching with surfaces, named surface materialization, and defensive fallbacks (missing/invalid surface). Integration test confirms surface DB appears at the right path and legacy search.db doesn't. Golden output files updated. CLI plan JSON verified to include surfaces key.

  6. [positive] Migration path is clean. Both synix.transforms and synix.ext re-export everything temporarily with clear docstrings about the migration direction. SearchIndex lowered to inherit from SearchSurface. search_db_path kept as a shim.

  7. [nit] The RFC at 1030 lines is longer than most implementation PRs. Consider whether it should live in docs/ at merge time or be tracked separately as an ADR.

  8. [nit] The golden stderr files for template 02 now elide traceback frames — that's a separate behavioral change that isn't explained in this PR.

  9. [question] Fingerprint stability. The uses fingerprint component is sorted(layer.name for layer in self.uses). If a surface is renamed, this invalidates caches for all consuming transforms. Is that intended? It seems correct but worth documenting.

Verdict

This is a well-structured incremental step that introduces the right abstraction boundary between build-time search capabilities and release targets, with good test coverage and a clean migration path — the main risks are the duplicated helpers and the leaky SQLite internals in the new search-surface opener.

Review parameters
  • Model: claude-opus-4-6
  • Context: README.md, DESIGN.md, synix.dev, PR diff
  • Diff size: 3,881 lines
  • Prompt: .github/prompts/claude_review.md
  • Timestamp: 2026-03-06T20:50:44Z

@github-actions
Copy link
Contributor

github-actions bot commented Mar 6, 2026

Note

Red Team Review — OpenAI GPT-5.2 | Adversarial review (docs + diff only)

Threat assessment — Medium-high risk: you’re introducing a new public concept (SearchSurface) and new pipeline graph semantics (uses) while simultaneously shuffling public module boundaries (synix.transforms vs synix.ext) and keeping compatibility shims that can silently misbehave.

One-way doors

  1. SearchSurface as a public API type (src/synix/core/models.py, exported in src/synix/__init__.py)

    • Why hard to reverse: Users will write pipelines with it; it becomes part of the mental model and serialized artifacts/caches may start keying off its behavior.
    • Safe to merge if: You freeze at least the minimal contract (field names: name, sources, modes, embedding_config; lifecycle expectations; where it materializes; whether it is “canonical” vs “compat”) and add a deprecation policy for SearchIndex.
  2. Transform.uses semantics + fingerprint inclusion (src/synix/core/models.py)

    • Why hard to reverse: Changing uses meaning later will invalidate caches and break pipelines that relied on search availability/order constraints.
    • Safe to merge if: You define strict semantics: what uses can contain (capabilities only), whether it affects execution order, and how it affects fingerprints and cache keys.
  3. Public module split: generic transforms in synix.transforms, opinionated memory transforms in synix.ext (README/docs/templates/tests)

    • Why hard to reverse: Import paths become muscle memory; third-party examples and blog posts calcify this.
    • Safe to merge if: You commit to a timeline + compatibility window and make sure both modules have stable re-exports (and warnings) with a single canonical home each.

Findings

  1. src/synix/core/models.py: Pipeline.add() classification bug

    • Pattern: if isinstance(layer, (SearchIndex, FlatFile)) ... elif isinstance(layer, SearchSurface) ...
    • Failure mode: SearchIndex is a subclass of SearchSurface, so SearchIndex never reaches the SearchSurface branch (fine), but a plain SearchSurface will be put in surfaces only; yet it still has depends_on edges (since it subclasses Layer) and can accidentally be treated as part of the DAG elsewhere if code iterates pipeline.layers + pipeline.projections and expects depends_on to be complete. This split DAG is fragile.
    • Severity: [warning]
  2. src/synix/build/runner.py: search surfaces are materialized via _materialize_projection() and reuse search_index projection

    • Pattern: treating SearchSurface as proj_type="search_surface" but still calling get_projection("search_index", ..., db_path)
    • Failure mode: You claim surfaces are “build-time capabilities” but implement them as just another SQLite projection. That’s not inherently wrong, but it contradicts the RFC’s “opaque + canonical state” direction and risks users starting to rely on .db layout anyway.
    • Severity: [warning]
  3. src/synix/build/search_surfaces.py: validation uses _level ordering on sources

    • Pattern: blocking_sources = ... if source._level >= layer._level
    • Failure mode: _level is an internal scheduling artifact. If levels are recomputed differently (or equal levels occur), you can reject valid pipelines or allow invalid ones. Also, surfaces are not in the build DAG; using DAG levels to validate capability readiness is a leaky abstraction.
    • Severity: [warning]
  4. src/synix/build/llm_transforms.py: _open_search_surface_index() swallows errors and silently disables retrieval

    • Pattern: except sqlite3.Error: logger.warning(...); return None
    • Failure mode: TopicalRollup behavior changes silently from “retrieval-assisted clustering” to “use all episodes for every topic” depending on file presence/schema. That is a semantic change with no hard signal to user, and can blow up cost/tokens.
    • Severity: [critical]
  5. Fingerprinting drift risk: Transform.compute_fingerprint() includes uses signatures, but runtime injects search_surfaces handles into config

    • Pattern: components["uses"] = fingerprint_value(use_signatures) plus config also contains search_surfaces/search_surface/search_db_path
    • Failure mode: If compute_fingerprint() also fingerprints config (likely), you may inadvertently include build-dir-specific db_path strings (non-hermetic) in the fingerprint, causing cache misses across machines/paths. You’re trying to separate “identity” from “realization,” but config injection reintroduces realization.
    • Severity: [critical] (because it undermines incremental rebuild guarantees)
  6. Docs/README inconsistency

    • Pattern: README says generic transforms are in synix.transforms and bundled in synix.ext, but synix.ext still re-exports the generic transforms “temporarily”.
    • Failure mode: Users won’t know which import path is canonical; tutorials will diverge; deprecation later becomes painful.
    • Severity: [minor]

Missing

  • Tests asserting fingerprint stability w.r.t. absolute paths for search surface handles (db_path) and build_dir changes.
  • A hard failure mode or explicit flag when a transform declares uses=[surface] but the surface can’t be opened / has wrong schema. At least make it opt-in strict (--strict-surfaces or config).
  • Documentation of the uses contract in docs/pipeline-api.md beyond a single example: what capabilities exist, how they are validated, and how they affect caching.
  • Concurrency/race tests for surface DB materialization (build/surfaces/*.db) under -j builds.

Verdict

Block. The direction is reasonable, but the current implementation has silent semantic fallbacks and likely non-hermetic fingerprinting via injected db_path, which will cause unpredictable rebuild behavior and cost explosions. Fix those and rerun the design review.

Review parameters
  • Model: gpt-5.2
  • Context: README.md, DESIGN.md, synix.dev, PR diff
  • Diff size: 4,073 lines
  • Prompt: .github/prompts/openai_review.md
  • Timestamp: 2026-03-06T21:00:25Z

@github-actions
Copy link
Contributor

github-actions bot commented Mar 6, 2026

Note

Architectural Review — Claude Opus | Blind review (docs + diff only)

Summary

This PR introduces SearchSurface as a first-class pipeline concept, separating build-time search capabilities from projection/release concerns. It reorganizes the module hierarchy so generic transforms (MapSynthesis, etc.) live in synix.transforms while domain-specific bundled transforms (EpisodeSummary, etc.) move to synix.ext. It includes a detailed RFC document, model changes, planner/runner integration, template updates, and comprehensive tests.

Alignment

Strong fit with the project vision. DESIGN.md §3.3 requires cache keys to capture all inputs — this PR adds uses to the fingerprint (models.py compute_fingerprint). The RFC's separation of logical searchable state from mutable realization aligns with §1.5's immutable build history split. The depends_on vs uses distinction is a clean expression of DESIGN.md's DAG model: artifact flow stays in the dependency graph, capability access is declared separately. The entity-model update correctly reframes the "projections as DAG nodes" north star (§6.1) away from the rejected Option B toward explicit capability declarations — consistent with Hypothesis 3 (architecture is a runtime concern).

Observations

  1. [positive] validate_search_surface_uses in search_surfaces.py catches ordering violations at plan time — prevents a transform from consuming a surface whose sources haven't been built yet. Good defensive check with clear error messages.

  2. [positive] Extensive test coverage: DAG registration test, surface-waits-for-all-sources test, plan caching after build, rejection of future-layer dependencies, CLI output golden file updates, integration tests for the named surface path. Both happy paths and edge cases are covered.

  3. [concern] _open_search_surface_index in llm_transforms.py silently returns None on failure, and TopicalRollup.split() falls back to using all episodes. This is the right defensive behavior, but the sqlite3.Error catch with only a logger.warning could mask real configuration bugs in production. Consider logging at a higher level or adding a --strict mode.

  4. [concern] The module reorganization creates a bidirectional re-export situation: synix.transforms re-exports from synix.ext, and synix.ext re-exports generic transforms from synix.ext.* plus bundled transforms from synix.build.llm_transforms. The transforms/summarize.py shim now imports from synix.ext which imports from synix.build.llm_transforms. Three levels of indirection during migration could confuse contributors. The docstrings help, but a migration timeline would be better.

  5. [question] SearchSurface extends Layer and gets added via pipeline.add(), but is routed to pipeline.surfaces instead of pipeline.layers. Since SearchIndex now inherits from SearchSurface, it hits the isinstance(layer, (SearchIndex, FlatFile)) check first in Pipeline.add() and goes to projections. This ordering dependency in add() is subtle — if someone reorders the isinstance checks, SearchIndex could land in surfaces instead. Worth a comment or an explicit type check.

  6. [positive] The RFC document is thorough — options considered, rejected alternatives explained, migration strategy, implementation phases, test strategy. This is good decision documentation for a pre-1.0 project.

  7. [nit] CLAUDE.md update describes llm_transforms.py as "Bundled memory transforms + shared LLM helper functions" but the shared helpers (_make_llm_client, _logged_complete, _open_search_surface_index) are growing. Consider splitting helpers into their own module before the next major change.

  8. [positive] The usage_signature() method on SearchSurface provides a stable, JSON-serializable identity for fingerprinting — clean extension point.

  9. [question] ShadowIndexManager.__init__ now does self.main_path = target if target.suffix else target / "search.db". The suffix check is fragile — a path like /tmp/build.v2/surfaces/ would match .v2 as a suffix. Consider checking specifically for .db.

  10. [nit] Golden file diffs for stderr traces (validate_initial.stderr.txt) remove intermediate stack frames. If these are normalized by the test harness, fine — but if they represent actual output changes, they should be called out.

Verdict

This is a well-designed incremental step that correctly separates build-time search capabilities from release concerns, with strong test coverage and a clear migration path — a good foundation for the RFC's later phases.

Review parameters
  • Model: claude-opus-4-6
  • Context: README.md, DESIGN.md, synix.dev, PR diff
  • Diff size: 4,073 lines
  • Prompt: .github/prompts/claude_review.md
  • Timestamp: 2026-03-06T21:00:30Z

@github-actions
Copy link
Contributor

github-actions bot commented Mar 6, 2026

Note

Red Team Review — OpenAI GPT-5.2 | Adversarial review (docs + diff only)

Threat assessment — Medium-high risk: this PR introduces new public API concepts (SearchSurface, TransformContext, uses=[...]) and quietly changes transform execution contracts; it’s directionally good but not yet safe.

One-way doors

  1. Transform.execute(..., ctx: TransformContext) / split(..., ctx) signature change

    • Hard to reverse: every third-party/custom transform will key off this; reversing later means dual-support forever or mass breakage.
    • Safe to merge if: there’s a documented migration policy, runtime adapter supports both call styles robustly (old dict + new ctx), and tests cover legacy custom transforms.
  2. SearchSurface + uses=[SearchSurface] as the build-time retrieval dependency mechanism

    • Hard to reverse: it becomes the canonical mental model for “retrieval during build.” Changing semantics later breaks pipelines and cached fingerprints.
    • Safe to merge if: the semantics are nailed down (surface lifecycle, caching, availability rules), and you’re not also promising the RFC’s “canonical surface state” while only implementing “sqlite db in build/”.
  3. Artifact/materialization path: build/surfaces/<name>.db

    • Hard to reverse: users will script around these paths even if you say “opaque.”
    • Safe to merge if: CLI/docs explicitly say it’s unstable + provide a supported API to access it (ctx.search only), and the code stops leaking the path via search_db_path as a “default”.

Findings

  1. src/synix/core/models.pyPipeline.add() classification bug for SearchIndex

    • SearchIndex now subclasses SearchSurface, but Pipeline.add() checks (SearchIndex, FlatFile) before SearchSurface, so every SearchIndex is treated as a projection, not a surface. That may be intended, but it collides with validate_search_surface_uses() which treats pipeline.projections as a place SearchSurface can live. You now have two registration channels with overlapping types. This is fragile and will confuse both planner and users.
    • Severity: [warning]
  2. src/synix/build/search_surfaces.py — validation uses object identity for registration

    • used not in registered_surfaces compares layer objects, not names. If a user constructs an equivalent SearchSurface("episode-search", ...) object twice (common in Python pipelines), uses=[surface_a] and pipeline.add(surface_b) will fail.
    • Severity: [critical]
  3. src/synix/build/runner.py — surface materialization is still just projections, contradicting RFC

    • Despite the RFC claiming “canonical immutable surface state in .synix”, implementation is a mutable sqlite file in build/. That’s fine as an intermediate step, but then don’t present it as the new conceptual model in docs (pipeline-api.md calls it “build-time capability declaration”). Today it’s still a projection with a different trigger.
    • Severity: [warning]
  4. src/synix/build/plan.pyTransformContext constructed with runtime fields affects split() behavior

    • Planner calls layer.split(inputs, runtime_ctx) where runtime_ctx includes workspace + injected search handles. That means split() can now depend on filesystem state and differ between plan/run if the surface DB exists or not. You added tests for caching, but you didn’t pin the invariant “plan must not depend on mutable build-dir side effects.”
    • Severity: [warning]
  5. src/synix/core/models.py — transform fingerprint now includes uses with embedding_config and config

    • This is correct for correctness, but it’s a cache invalidation grenade: any non-fingerprint-stable values in embedding_config / config (dict ordering handled, but values can be objects) can cause nondeterministic digests or serialization surprises (default=str).
    • Severity: [warning]
  6. src/synix/core/search_handles.pySearchSurfaceHandle.is_available() is expensive and racy

    • It opens sqlite, checks tables. Called during resolve_search_surface_handle() potentially per transform invocation. On large builds, repeated open/close will hurt. Also races with concurrent materialization if you ever parallelize surfaces.
    • Severity: [minor]
  7. Docs/website inconsistency

    • Website copy still frames “SearchIndex + FlatFile projections” as first-class; README now moves generic transforms to synix.transforms and “bundled ext” to synix.ext. DESIGN.md still describes older pipeline API (function prompts, etc.). You’re accumulating contradictory public narratives.
    • Severity: [warning]

Missing

  • Backward-compat tests for custom transforms still using execute(inputs, config: dict) and mutating config (common pattern). You updated internal/bundled transforms but didn’t prove external transforms won’t break.
  • Name-based surface registration (or canonical surface IDs) to avoid identity pitfalls.
  • Clear deprecation plan: how long will search_db_path exist; when does synix.ext stop re-exporting platform transforms; what’s stable pre-1.0?
  • Concurrency story for surface DBs if transforms in parallel read/write or if multiple builds run against same build_dir.

Verdict

Block. The object-identity registration bug in validate_search_surface_uses() is a real footgun, and the API shift is too large without robust legacy adaptation + documentation of stability/deprecation. Fix those, then re-review.

Review parameters
  • Model: gpt-5.2
  • Context: README.md, DESIGN.md, synix.dev, PR diff
  • Diff size: 5,500 lines
  • Prompt: .github/prompts/openai_review.md
  • Timestamp: 2026-03-06T21:40:26Z

@github-actions
Copy link
Contributor

github-actions bot commented Mar 6, 2026

Note

Architectural Review — Claude Opus | Blind review (docs + diff only)

Summary

This PR introduces SearchSurface as a first-class pipeline concept, replaces the implicit search_db_path convention with explicit uses=[surface] declarations on transforms, migrates all transform signatures from config: dict to ctx: TransformContext, and reorganizes the module hierarchy so generic transforms (MapSynthesis, etc.) live in synix.transforms while bundled memory transforms (EpisodeSummary, etc.) move to synix.ext. An accompanying RFC document lays out the full design rationale.

Alignment

Strong fit. DESIGN.md §3.3 says cache keys must capture all inputs — this PR adds uses signatures to fingerprints, ensuring search surface identity participates in cache invalidation. The TransformContext replacing raw dicts aligns with the Python-first principle (§4.1): typed context over bag-of-strings config. The SearchSurface/uses split between artifact flow (depends_on) and capability access (uses) respects the DAG model without making local realizations into recursive build inputs — exactly what the RFC argues. The module reorganization (generic platform transforms vs. opinionated memory transforms) advances the extension model hypothesis: pipeline authors get clean primitives, domain-specific transforms are opt-in.

Observations

  1. [positive] validate_search_surface_uses in search_surfaces.py catches level-ordering violations at plan time, preventing transforms from consuming surfaces whose sources haven't built yet. Good fail-fast behavior with clear error messages.

  2. [positive] TransformContext implements MutableMapping, preserving backward compatibility with dict-style access while exposing typed properties (ctx.llm_config, ctx.search()). The from_value factory and with_updates immutable merge pattern are clean.

  3. [concern] SearchSurfaceHandle.query() opens and closes a SearchIndex connection on every call. In TopicalRollup.split(), this means N separate open/close cycles (one per topic). For pipelines with many topics, this could cause measurable overhead. The old code opened once and closed once.

  4. [concern] SearchIndex now subclasses SearchSurface (line in models.py), but Pipeline.add() checks isinstance(layer, (SearchIndex, FlatFile)) before isinstance(layer, SearchSurface). Since SearchIndex is a SearchSurface, the ordering matters — SearchIndex lands in projections, not surfaces. This is correct but fragile; a reorder of the isinstance checks would silently break routing.

  5. [question] The _RUNTIME_ONLY_KEYS frozenset on TransformContext filters keys for the .config property, but there's no mechanism to prevent new runtime keys from leaking into user-facing config. If a future contributor adds "_foo" to the context, it will appear in ctx.config unless manually added to the frozenset.

  6. [concern] ShadowIndexManager.__init__ changed from build_dir to target_path semantics with a heuristic (target.suffix). If someone passes a path without a suffix that isn't a directory, self.main_path becomes that path directly rather than appending search.db. This feels brittle for edge cases.

  7. [positive] Tests are thorough — new unit tests for surface registration in the DAG, plan-time validation of surface ordering, integration test for named surface materialization, and the behavioral change from silent fallback to hard error on missing surfaces is explicitly tested (SearchSurfaceUnavailableError).

  8. [nit] Both synix.transforms and synix.ext re-export each other's symbols for compatibility, creating a circular-feeling import graph. The docstrings say "temporarily" — worth tracking as tech debt.

  9. [positive] The golden file updates and E2E test rewrites confirm that CLI output changes (new Surfaces: line) are covered end-to-end.

  10. [question] The RFC is 1,030 lines and included in the diff as a shipped doc. Is this intended as living documentation or should it be moved to a decisions/ADR directory before it becomes stale?

Verdict

This is a well-structured incremental step that cleans up a real architectural smell (implicit search_db_path) with a principled replacement, introduces the right abstractions at the right boundaries, and maintains backward compatibility — a good PR for the project's current stage.

Review parameters
  • Model: claude-opus-4-6
  • Context: README.md, DESIGN.md, synix.dev, PR diff
  • Diff size: 5,500 lines
  • Prompt: .github/prompts/claude_review.md
  • Timestamp: 2026-03-06T21:40:35Z

@github-actions
Copy link
Contributor

github-actions bot commented Mar 6, 2026

Note

Red Team Review — OpenAI GPT-5.2 | Adversarial review (docs + diff only)

Threat assessment — Medium-high risk: this PR quietly introduces new public API surface area (SearchSurface, TransformContext, uses=) while still carrying legacy behaviors, and there are some correctness holes around surface identity, caching, and runtime/path semantics.

One-way doors

  1. Transform.execute(inputs, ctx) / split(inputs, ctx) signature change

    • Hard to reverse because every custom transform will implement this; even with “mapping-compatible” shims, you’re effectively committing to the context object model.
    • Safe to merge when: you publish an explicit deprecation/migration timeline, keep a working adapter for config: dict transforms (with tests proving it across runner/plan/batch), and document the supported subset of legacy behavior.
  2. uses=[SearchSurface(...)] as the capability dependency mechanism

    • Hard to reverse because it becomes the mental model and pipeline authoring style; renaming/semantics changes will break pipelines.
    • Safe to merge when: you define strict semantics (ordering, fingerprinting, optional vs required, multi-surface disambiguation) in docs (not just RFC) and add validation/tests for the edge cases.
  3. On-disk materialization path build/surfaces/{name}.db

    • Hard to reverse because templates/tests and users will start relying on it (even if you say “internal”).
    • Safe to merge when: CLI/docs explicitly label it unstable and you provide an official API for build-time querying (which you partly do via ctx.search) and avoid encouraging path usage.

Findings

  1. src/synix/core/models.pyPipeline.add() classification bug (SearchIndex is a SearchSurface)

    • Pattern: if isinstance(layer, (SearchIndex, FlatFile)) ... elif isinstance(layer, SearchSurface) ...
    • Failure mode: SearchIndex(...) will match the SearchIndex branch, but it’s also a SearchSurface subclass now; you’re maintaining two parallel registries (pipeline.projections and pipeline.surfaces) with overlapping types. Expect subtle bugs when someone passes a SearchIndex where a SearchSurface is expected, or when validation enumerates both lists.
    • Severity: [critical]
  2. src/synix/build/search_surfaces.py — surface registration/validation includes projections

    • Pattern: registered_surfaces = {layer.name: layer for layer in [*pipeline.surfaces, *pipeline.projections] if isinstance(layer, SearchSurface)}
    • Failure mode: a SearchIndex (projection) can satisfy a uses=[SearchSurface("x", ...)] requirement by name/signature, which defeats the “capability vs projection” separation you’re trying to establish and reintroduces the old coupling.
    • Severity: [warning]
  3. src/synix/build/runner.py + src/synix/build/plan.py — inconsistent “surface cache” semantics

    • Pattern: surfaces are materialized incrementally in-run (_materialize_layer_search_surfaces) but the cache metadata is only refreshed at end (_refresh_surface_cache), and planning uses projection-cache-style hashes.
    • Failure mode: mid-build failures can leave a surface DB present but cache metadata stale/missing; next plan/run may misclassify cached/new and create confusing rebuild behavior.
    • Severity: [warning]
  4. src/synix/core/search_handles.pySearchSurfaceHandle.is_available() opens DB repeatedly

    • Pattern: every ctx.search(..., required=...) calls is_available() which opens sqlite and checks table existence.
    • Failure mode: for transforms that query frequently or resolve surfaces often, you’ve introduced extra SQLite open/close churn. This will become noticeable at scale and on network filesystems.
    • Severity: [minor]
  5. src/synix/build/plan.py_plan_layer constructs TransformContext with workspace.build_dir as str(Path(pipeline.build_dir))

    • Failure mode: mixed usage of build_dir (sometimes build_dir param, sometimes pipeline.build_dir) can desync when callers override build dir. This sort of path confusion becomes a persistent source of “works in plan, fails in run” bugs.
    • Severity: [warning]
  6. Docs drift / public story conflict

    • README now says platform transforms are in synix.transforms and ext contains “bundled” memory transforms, but synix.transforms/__init__.py still re-exports the bundled ones “temporarily”. That’s exactly the kind of half-migration that users will cargo-cult into permanent usage.
    • Severity: [minor]

Missing

  • A hard guarantee/test matrix for backward compatibility: you added a couple “legacy config-style” tests, but I expected explicit tests covering:

    • legacy split(inputs, dict) + legacy execute(inputs, dict) under batch runner too
    • transforms with uses=[] but still reading search_db_path (should be supported or explicitly deprecated)
    • multiple surfaces in uses=[...] requiring explicit surface= selection and error messaging
  • Documentation of SearchSurface as shipped API (not RFC): pipeline-api mentions it, but you need a stable contract section: required fields, whether .db exists, how caching works, what guarantees exist for modes beyond fulltext.

Verdict

Ship with fixes — block on the SearchIndex/SearchSurface type split ambiguity and validation semantics; once that’s resolved and you add the missing compatibility tests + clarify the public contract, the direction is defensible.

Review parameters
  • Model: gpt-5.2
  • Context: README.md, DESIGN.md, synix.dev, PR diff
  • Diff size: 5,668 lines
  • Prompt: .github/prompts/openai_review.md
  • Timestamp: 2026-03-06T21:53:13Z

@github-actions
Copy link
Contributor

github-actions bot commented Mar 6, 2026

Note

Architectural Review — Claude Opus | Blind review (docs + diff only)

Summary

This PR introduces SearchSurface as a first-class pipeline concept, TransformContext as a typed runtime context replacing raw config dicts, and reorganizes the module hierarchy so generic transforms (MapSynthesis, etc.) live in synix.transforms while bundled memory transforms (EpisodeSummary, etc.) move to synix.ext. It includes an RFC document, validation logic for surface usage ordering, and comprehensive test updates.

Alignment

Strong alignment. The RFC and implementation directly address a gap the DESIGN.md identified: transforms consuming search during builds relied on implicit search_db_path side effects rather than declared dependencies. The new uses=[surface] declaration makes build-time capabilities explicit without polluting depends_on (artifact flow), which respects the DAG-based provenance model. Cache keys now include uses signatures (via compute_fingerprint), consistent with the DESIGN.md principle that materialization keys must capture all inputs. The TransformContext mapping-compatibility preserves the Python-first, experiment-friendly ethos—legacy transforms keep working.

Observations

  1. [positive] validate_search_surface_uses in search_surfaces.py:62-96 enforces that a transform cannot use a surface whose sources haven't been built yet, using DAG level comparison. This is exactly the kind of build-order safety a build system needs.

  2. [positive] TransformContext implements MutableMapping so all existing config["key"] / config.get("key") / config.copy() patterns keep working. The legacy compatibility tests (TestLegacyCustomTransformCompatibility) explicitly verify this.

  3. [concern] SearchSurfaceHandle.query() opens and closes a SearchIndex connection on every call (search_handles.py:131-153). For TopicalRollup.split() iterating over N topics, this is N open/close cycles. At scale (many topics or large DBs), this could cause noticeable overhead. Consider a context-managed or cached connection pattern.

  4. [concern] The SearchIndex inheritance change (SearchIndex(SearchSurface) in models.py:288) means isinstance(x, SearchSurface) now matches SearchIndex instances. The Pipeline.add() method checks SearchIndex before SearchSurface due to MRO, but _materialize_layer_search_surfaces iterates pipeline.surfaces—which SearchIndex never enters since it's caught first by the (SearchIndex, FlatFile) check. This is correct but fragile; reordering the isinstance checks in add() would silently break routing.

  5. [concern] ShadowIndexManager.__init__ changed to accept target_path and infer whether it's a file or directory (indexer.py:271-274). The heuristic target.suffix is empty for paths like /tmp/build/surfaces/search without extension—this would incorrectly append /search.db. The callers currently always pass .db paths, but the implicit behavior is surprising.

  6. [question] The module reorganization creates bidirectional re-exports: synix.transforms re-exports from synix.ext, and synix.ext re-exports from synix.transforms (the generic ones). Is there a planned deprecation timeline? The CLAUDE.md update and docstrings say "temporarily," but without a version target this could persist indefinitely.

  7. [nit] _RUNTIME_ONLY_KEYS in TransformContext is a hardcoded frozen set. New runner-injected keys must be added here manually or prefixed with _. The _is_runtime_only_key method handles underscore-prefixed keys, but workspace and search_surfaces require explicit listing—easy to forget.

  8. [positive] The test coverage is thorough: DAG registration, surface ordering validation, mismatched surface rejection, surface wait-for-all-sources, legacy config compatibility, plan JSON output including surfaces, and the topical rollup's behavior shift from silent fallback to hard error. The golden output files are updated consistently.

  9. [positive] The RFC (search-surface-rfc.md) is well-structured with options considered, rejected alternatives, and explicit non-goals. It separates concerns cleanly: logical capability vs. local realization vs. release target.

  10. [nit] The get_context call at the top of every execute/split method (ctx = self.get_context(ctx)) is boilerplate. Consider handling this in a base-class wrapper or decorator.

Verdict

This is a well-executed architectural step that replaces an implicit side-effect dependency with an explicit, validated, fingerprinted capability model—exactly the kind of rigor a build system needs. Ship it, with a note to address the per-query connection overhead before templates with many topics hit production scale.

Review parameters
  • Model: claude-opus-4-6
  • Context: README.md, DESIGN.md, synix.dev, PR diff
  • Diff size: 5,668 lines
  • Prompt: .github/prompts/claude_review.md
  • Timestamp: 2026-03-06T21:53:25Z

@github-actions
Copy link
Contributor

github-actions bot commented Mar 6, 2026

Note

Red Team Review — OpenAI GPT-5.2 | Adversarial review (docs + diff only)

Threat assessment — High risk: this PR quietly introduces new public API concepts (SearchSurface, uses, TransformContext) and changes transform invocation semantics across runner/plan/batch, with multiple backwards-compat shims that can mask breakage.

One-way doors

  1. Transform.execute(..., ctx: TransformContext) / split(..., ctx) signature change

    • Hard to reverse because every custom transform implementation will couple to this call shape.
    • Safe to merge only if: (a) you actually support old-style execute(inputs, config: dict)/split(inputs, config) reliably, (b) you document the migration window and deprecation policy, and (c) you add compatibility tests that cover “old transform with strict dict assumptions” in runner and batch runner.
  2. SearchSurface + uses=[...] as a new pipeline modeling concept

    • Hard to reverse because it becomes part of pipeline authoring mental model and docs/templates.
    • Safe only if: you commit to the conceptual separation long-term (capability vs artifact), and you lock down minimal semantics (naming, lifecycle, caching rules) + error model.
  3. On-disk layout: build/surfaces/{surface}.db

    • You’re creating a new persisted artifact users will discover and script against.
    • Safe only if: you clearly declare it “compatibility/internal” and provide a stable CLI/API way to query surfaces without path dependence (otherwise people will depend on the path anyway).

Findings

  1. src/synix/core/models.py: signature mismatch vs “legacy compatibility” claims

    • Pattern: class LegacyConfigTransform(Transform): def execute(..., config: dict) in tests, but base class now requires execute(..., ctx: TransformContext).
    • Failure mode: Python will not enforce interface, but your runner always calls execute(unit_inputs, unit_ctx); legacy transforms that type-check or isinstance(config, dict) will break. Your tests don’t cover strict dict expectations, only .copy() and mutation.
    • Severity: [critical]
  2. src/synix/build/runner.py + _execute_transform_concurrent: context downgraded to dict then re-wrapped

    • Pattern: base_ctx = TransformContext.from_value(config); base_config = base_ctx.to_dict(); ... worker_ctx = TransformContext.from_value(worker_config)
    • Failure mode: runtime-only handles (like SearchSurfaceHandle) are mapping types; you’re deep-copying base_config which may copy non-picklable objects. You only exclude _logger, not _shared_llm_client or potentially search handles. Also you now create _shared_llm_client in worker_config, but you didn’t exempt it from deepcopy earlier.
    • Severity: [warning]
  3. src/synix/build/search_surfaces.py: validate_search_surface_uses is too restrictive

    • Pattern: for used in layer.uses: if not isinstance(used, SearchSurface): raise TypeError(...)
    • Failure mode: You just made uses unusable for any future capability type (even internal ones). That’s a design trap: you’ll either keep widening isinstance checks or break users later.
    • Severity: [warning]
  4. src/synix/build/runner.py: “Search surfaces only materialize once full source closure is available” is not actually enforced by dependency graph

    • Pattern: SearchSurface is not in pipeline.layers DAG; you trigger materialization opportunistically per-layer.
    • Failure mode: race/order bugs when surfaces source multiple layers and concurrency is involved; also “ready” is computed from available_names = set(layer_artifacts) which is a dict of layers seen, not “built successfully in this run”. If a layer exists from prior build but current run failed mid-way, you can materialize a surface over stale+new mix.
    • Severity: [critical]
  5. src/synix/core/search_handles.py: SearchSurfaceHandle.is_available() opens DB and checks table

    • Failure mode: called during transform execution resolution; this adds filesystem I/O + sqlite open per lookup. In pipelines with many work units, this becomes hot-path overhead and will scale poorly.
    • Severity: [warning]
  6. Docs/API churn: synix.transforms vs synix.ext flip-flop

    • Pattern: README changed to import generic transforms from synix.transforms, but code still re-exports memory transforms there “temporarily”.
    • Failure mode: users won’t know canonical imports; you’re institutionalizing a muddy public surface right before 1.0.
    • Severity: [warning]

Missing

  • Hard compatibility tests for legacy transforms that:
    • require dict (e.g., assert type(config) is dict)
    • rely on in/keys() semantics exactly
    • expect config to be fingerprint-stable (no runtime keys)
  • Failure-mode tests for partial build + surface materialization correctness (surface must not update if upstream layer build failed).
  • Concurrency tests where a transform uses a surface and runner is -j > 1 (prove no sqlite locking/path contention).
  • Documentation explicitly stating whether build/surfaces/*.db is stable or internal, and how it interacts with .synix/ snapshots (right now it looks like more mutable “source of truth” leakage).

Verdict

Block — the surface concept is plausible, but the current implementation mixes “capability” with “mutable sqlite file” and has correctness holes around partial builds + a shaky backward-compat story for custom transforms.

Review parameters
  • Model: gpt-5.2
  • Context: README.md, DESIGN.md, synix.dev, PR diff
  • Diff size: 5,712 lines
  • Prompt: .github/prompts/openai_review.md
  • Timestamp: 2026-03-06T22:09:11Z

@github-actions
Copy link
Contributor

github-actions bot commented Mar 6, 2026

Note

Architectural Review — Claude Opus | Blind review (docs + diff only)

Summary

This PR introduces SearchSurface as a first-class build-time capability, TransformContext as a typed runtime context replacing raw config dicts, and reorganizes the module layout so generic transforms (MapSynthesis, etc.) live in synix.transforms while bundled memory transforms (EpisodeSummary, etc.) move to synix.ext. It includes a 1,030-line RFC document, updates all templates/tests, and adds validation for surface usage ordering.

Alignment

Strong fit. DESIGN.md says cache keys must capture all inputs — this PR adds uses signatures to the fingerprint (models.py L249-256). The RFC's separation of logical searchable state from local realization directly advances "architecture is a runtime concern" by making search backend-neutral. The depends_on vs uses distinction keeps the provenance DAG clean: artifact flow stays in depends_on, capability access is explicit. This is consistent with the immutable/content-addressed artifact model — search surfaces are not treated as upstream artifacts that would muddy the DAG.

Observations

  1. [positive] validate_search_surface_uses (search_surfaces.py L57-99) catches ordering violations, mismatched declarations, and the SearchIndex-as-surface mistake at plan time with clear error messages. Excellent defensive design.

  2. [positive] TransformContext implements MutableMapping so legacy config dict code keeps working. The compatibility tests (test_transforms.py TestLegacyCustomTransformCompatibility) explicitly verify .copy(), mutation, and split() with old signatures.

  3. [concern] SearchSurfaceHandle.query() opens and closes a SearchIndex connection on every call (search_handles.py L131-155). In TopicalRollup.split(), this happens once per topic. With 20+ topics, that's 20+ open/close cycles on the same SQLite file. Consider caching the connection or accepting a context manager pattern.

  4. [concern] The module reorganization creates a confusing circularity: synix.transforms re-exports from synix.ext, and synix.ext re-exports from synix.build.llm_transforms. Both packages export all names from the other during migration. A new contributor reading from synix.transforms import EpisodeSummary then from synix.ext import EpisodeSummary would not know which is canonical without reading both __init__.py files.

  5. [question] _build_transform_config still builds a fingerprint-stable dict, while _build_transform_context wraps it with runtime capabilities. But compute_fingerprint is called on the raw config dict, not the context — is it intentional that the uses component is computed from self.uses (the layer attribute) rather than from the runtime context? This seems correct but the dual-path is subtle.

  6. [nit] The RFC (search-surface-rfc.md) is over 1,000 lines and lives in docs/ as a committed file. It mixes resolved decisions with deferred questions. Consider splitting the resolved spec from the open questions to reduce reader confusion.

  7. [positive] Test coverage is thorough: unit tests for validation edge cases (mismatched surface, projection-not-surface, future-layer ordering), integration test for named surface materialization, plan caching after build, and golden file updates for CLI output changes.

  8. [concern] surface_local_path puts surfaces under build/surfaces/. Since synix clean removes the build directory, surface realizations are ephemeral — correct per the RFC. But _refresh_surface_cache writes to the projection cache, which also lives in build/. If the cache entry survives but the DB doesn't (partial clean), the next plan could report "cached" for a missing surface.

  9. [nit] ShadowIndexManager.__init__ changed from build_dir to target_path with heuristic suffix detection (indexer.py L271-274). The if target.suffix branch means passing a directory without a suffix still works, but passing something like build/surfaces/episode-search (no .db) would produce unexpected paths.

Verdict

This is a well-structured incremental step that cleanly separates build-time search capabilities from projection outputs, with strong backward compatibility and thorough test coverage — the connection-per-query pattern in SearchSurfaceHandle is the main thing to address before this scales.

Review parameters
  • Model: claude-opus-4-6
  • Context: README.md, DESIGN.md, synix.dev, PR diff
  • Diff size: 5,712 lines
  • Prompt: .github/prompts/claude_review.md
  • Timestamp: 2026-03-06T22:09:19Z

@github-actions
Copy link
Contributor

github-actions bot commented Mar 6, 2026

Note

Red Team Review — OpenAI GPT-5.2 | Adversarial review (docs + diff only)

Threat assessment — Medium risk: it introduces new public API concepts (SearchSurface, TransformContext, uses=[...]) and changes transform invocation semantics across runner, planner, and batch runner, with several backward-compat shims that look brittle.

One-way doors

  1. TransformContext + new transform signatures (execute(inputs, ctx), split(inputs, ctx))

    • Why hard to reverse: Custom transforms will start depending on this typed context and ctx.search(...). Rolling it back would break user code.
    • Safe if: You explicitly version the transform API (even pre-1.0), document migration guarantees, and add compatibility tests for both signatures across sync + concurrent + batch + plan.
  2. SearchSurface + uses=[...] capability graph

    • Why hard to reverse: This becomes the mental model for “build-time capabilities” vs DAG dependencies; templates and docs now teach it.
    • Safe if: You commit to the semantic meaning of uses (not just search), define how it affects caching/fingerprints, and clarify lifecycle (canonical vs local) beyond “sqlite in build/surfaces”.
  3. On-disk convention: build/surfaces/<name>.db

    • Why hard to reverse: Users will script around it despite “internal” intent, especially since you show it in CLI status and tests.
    • Safe if: Provide an official programmatic way to locate/open surfaces and keep CLI from implying stability (or accept it as stable and treat it as such).

Findings

  1. src/synix/build/runner.py: _materialize_layer_search_surfaces uses _materialize_projection (SearchIndexProjection) to build surface DBs

    • Failure mode: You’re not actually implementing “canonical immutable surface state” from the RFC; you’re just creating more SQLite projections with different paths. This is a conceptual mismatch that will cause future migration pain and undermines the “not a projection” claim.
    • Severity: [warning]
  2. src/synix/build/search_surfaces.py: SearchSurfaceHandle is just {db_path, modes, sources} and queries via SQLite

    • Failure mode: The “backend-neutral” abstraction is fake right now; transforms are still coupled to the SQLite schema via synix.search.indexer.SearchIndex and the search_index table name. Any future backend swap will break transforms that call handle.query.
    • Severity: [warning]
  3. src/synix/build/plan.py: _plan_layer builds a TransformContext with workspace and runtime search injections but fingerprint uses transform_config (not ctx)

    • Failure mode: If a transform’s behavior depends on runtime-injected fields (like search_surfaces/search_db_path), plan may not reflect reality for cache keys. You partly mitigated by adding uses to fingerprint, but the actual surface contents/path are runtime and not in fingerprint. That’s a recipe for “cached” when execution would differ.
    • Severity: [warning]
  4. src/synix/core/models.py: TransformContext._RUNTIME_ONLY_KEYS includes llm_config

    • Failure mode: ctx.config strips llm_config, so any transform that previously read config["llm_config"] from “user config” semantics will now see it missing unless they switch to ctx.llm_config. That’s an easy silent behavioral change for custom transforms.
    • Severity: [warning]
  5. src/synix/build/runner.py: legacy signature detection via _transform_prefers_legacy_config_dict

    • Failure mode: Annotation- and parameter-name-based heuristics are unreliable. Plenty of user code won’t annotate, or will name the param something else, or will use Mapping/dict[str, Any]. Mis-detection will either pass a dict where a ctx is expected or vice versa.
    • Severity: [critical]
  6. src/synix/core/models.py: Pipeline.add() now separates layers vs surfaces vs projections

    • Failure mode: This is a real semantic change: surfaces are excluded from DAG ordering. Any code that assumed “all Layer subclasses are in pipeline.layers” will miss them. You added tests for this, but I see no audit of other modules that iterate pipeline.layers expecting completeness.
    • Severity: [warning]
  7. Docs inconsistencies: README now says platform transforms are synix.transforms, but src/synix/ext still re-exports them

    • Failure mode: You now have two “truths” and you will get users mixing imports. Deprecation policy is not stated; re-exports will calcify.
    • Severity: [minor]

Missing

  • A hard deprecation plan for search_db_path and SearchIndex as “capability”. You still inject search_db_path for legacy projections; that will linger forever unless you timebox it.
  • Tests covering heuristic signature detection failures: unannotated methods, config: Mapping, ctx param named config, etc. Right now you only test the happy path (“strict dict”).
  • Concurrency tests for surfaces: multiple surfaces materializing, or concurrent build and query. SQLite file creation races are plausible.
  • Documentation updates for website content (it still markets transforms/imports inconsistently; also the landing page text appears encoding-corrupted).

Verdict

Ship with fixes — block on the legacy signature detection brittleness and clarify surface semantics (either admit “surfaces are just additional SQLite indexes for now” or implement the canonical/realization split you’re claiming).

Review parameters
  • Model: gpt-5.2
  • Context: README.md, DESIGN.md, synix.dev, PR diff
  • Diff size: 5,830 lines
  • Prompt: .github/prompts/openai_review.md
  • Timestamp: 2026-03-06T22:24:22Z

@github-actions
Copy link
Contributor

github-actions bot commented Mar 6, 2026

Note

Architectural Review — Claude Opus | Blind review (docs + diff only)

Summary

This PR introduces SearchSurface as a first-class build-time searchable capability, replaces the implicit search_db_path convention with explicit uses=[surface] declarations on transforms, adds a TransformContext object to replace raw config: dict in the transform API, and reorganizes the module layout so generic transforms live in synix.transforms and bundled memory-oriented ones in synix.ext. A 1000-line RFC document accompanies the code.

Alignment

Strong fit. DESIGN.md §3.3 establishes that cache keys must capture all inputs — this PR adds uses signatures to fingerprints, so changing a search surface invalidates downstream transforms. The depends_on vs uses split cleanly separates artifact flow from capability access, preserving the DAG's integrity as described in the domain model. The SearchSurface concept aligns with the immutable build history vision (§1.5): surfaces produce canonical state, local realizations are disposable. The TransformContext advances the Python-first principle (§4.1) by making the extension contract typed rather than a loose dict bag.

Observations

  1. [positive] validate_search_surface_uses in search_surfaces.py enforces that a transform cannot consume a surface whose sources haven't been built yet (level check). This prevents a class of silent correctness bugs and matches the DAG ordering invariant.

  2. [positive] Legacy compatibility bridge via _transform_prefers_legacy_config_dict uses signature introspection to auto-detect old config: dict transforms and passes them a plain dict. The batch build test (test_legacy_strict_dict_transform_runs_after_batch_layer) and the concurrent runner test (assert type(config) is dict) verify this. Good coverage of a risky migration path.

  3. [concern] _transform_prefers_legacy_config_dict falls back to checking param.name == "config" when there's no annotation. A user who names their parameter config but expects TransformContext would get a dict. This is fragile — consider defaulting to TransformContext when unresolvable and documenting the annotation requirement.

  4. [concern] SearchSurfaceHandle.query() opens and closes a SearchIndex connection on every call. TopicalRollup.split() calls search.query() once per topic. With many topics this is O(n) open/close cycles. The old code explicitly closed once after the loop; this regression could matter at scale. Consider caching the connection or offering a context-manager pattern.

  5. [concern] SearchSurfaceHandle is a frozen dataclass that also implements MutableMapping — wait, it implements Mapping (immutable). Good. But TransformContext implements MutableMapping, meaning transform code can mutate it (ctx["key"] = val). Since with_updates returns a new context but the original is still mutable, concurrent workers could share and mutate the same context object. The runner does copy.deepcopy for workers, but the _run_one function constructs a fresh TransformContext from worker_config, so this is likely safe in practice.

  6. [question] The module reorganization puts generic transforms in synix.transforms and memory-oriented ones in synix.ext, but both packages re-export everything from the other for compatibility. The README example imports from synix.transforms, the pipeline-api doc imports from synix.ext. Will this confuse new users about the canonical import path?

  7. [positive] Comprehensive test coverage: unit tests for TransformContext.config filtering, DAG registration, surface validation (level ordering, mismatched declarations, projection-vs-surface confusion), integration test for surface materialization, e2e pipeline tests updated. The test_search_surface_waits_for_all_source_layers test directly validates the readiness check.

  8. [nit] TopicalRollup.split() calls self.get_search_surface(ctx, required=bool(self.uses)) — when self.uses is a non-empty list this is required=True, which now makes a missing surface a hard error rather than the old silent fallback. The test test_topical_rollup_requires_declared_surface_when_unavailable confirms this, but this is a behavioral breaking change for anyone who previously relied on the fallback. Documented nowhere as a migration note.

  9. [nit] Golden output files all gain Surfaces: 0 lines for existing templates. Correct mechanically, but slightly noisy for users reading plan output when no surfaces are declared.

Verdict

This is a well-structured architectural step that replaces an implicit file-path convention with an explicit, validated, fingerprinted capability model — a clear improvement to build correctness and the extension contract. Merge after addressing the connection-per-query performance regression and adding a migration note about the TopicalRollup fallback behavior change.

Review parameters
  • Model: claude-opus-4-6
  • Context: README.md, DESIGN.md, synix.dev, PR diff
  • Diff size: 5,830 lines
  • Prompt: .github/prompts/claude_review.md
  • Timestamp: 2026-03-06T22:24:35Z

@marklubin marklubin merged commit 514d27b into main Mar 6, 2026
12 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant