Optimization: Skip downstream recompilation if upstream output unchanged #102

@jlowin

Summary

When an upstream document is recompiled, all downstream documents are currently marked stale and recompiled. However, if the upstream's output hash is identical to that of the previous compilation, downstream documents don't need to recompile, because their inputs haven't changed.

Current Behavior

  1. Doc B is stale → recompiles → added to recompiled_uris
  2. Doc A (refs B) sees B in recompiled_uris → marked stale → recompiles
  3. This happens even if B's new output is byte-for-byte identical to the cached version

Proposed Optimization

Before adding a URI to recompiled_uris, compare the new output hash to the previously cached output hash. If they are identical, don't add the URI to recompiled_uris: downstream documents don't need to recompile because the content they reference is unchanged.

# Pseudocode: propagate staleness only when the output actually changed
if new_output_hash != cached_output_hash:
    recompiled_uris.add(uri)
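
A slightly fuller sketch of the same check, for illustration only. The helper names (output_hash, record_recompile) and the cached_hashes dict are hypothetical stand-ins; the actual manifest layout and the way engine.py hashes output are not shown in this issue, and SHA-256 is just one reasonable choice.

```python
import hashlib


def output_hash(output: str) -> str:
    """Return a SHA-256 hex digest of a compiled output (hypothetical helper)."""
    return hashlib.sha256(output.encode("utf-8")).hexdigest()


def record_recompile(uri, new_output, cached_hashes, recompiled_uris):
    """Store the new output hash for `uri` and report whether it changed.

    `cached_hashes` stands in for whatever the manifest stores per document.
    A URI with no cached hash is treated as changed.
    """
    new_hash = output_hash(new_output)
    changed = cached_hashes.get(uri) != new_hash
    cached_hashes[uri] = new_hash
    if changed:
        # Only a changed output should make downstream documents stale.
        recompiled_uris.add(uri)
    return changed
```

With this shape, a recompile of B that yields byte-identical output leaves recompiled_uris untouched, so A is never marked stale on B's account.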

Benefits

  • Faster incremental builds when upstream changes don't affect output (e.g., whitespace, comments)
  • Reduced LLM API costs when the LLM produces identical output
  • More correct caching behavior: recompile only when inputs actually change

Implementation Notes

  • Need to track the output content hash in the manifest (it may already exist as part of source_hash or similar)
  • The check happens in compile_all(), around lines 349-353 in engine.py
  • Should also re-check staleness of pending documents when an upstream completes with an unchanged hash
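
One way to sidestep the re-check in the last note is to decide staleness lazily, at the moment each document is reached in dependency order, rather than up front. The sketch below is an assumption about how that could look, not the current compile_all() in engine.py: compile_all_sketch, refs, source_stale, and compile_fn are hypothetical names, and it assumes documents compile sequentially and the reference graph has no cycles.

```python
import hashlib
from graphlib import TopologicalSorter


def compile_all_sketch(refs, source_stale, compile_fn, cached_hashes):
    """Compile documents upstream-first, propagating staleness only on change.

    refs: maps each URI to the set of URIs it references (its upstreams).
    source_stale: URIs whose own source changed since the last build.
    compile_fn: hypothetical callable, uri -> compiled output string.
    cached_hashes: uri -> output hash from the previous build (mutated).
    Returns the URIs that were actually compiled, in order.
    """
    changed_uris = set()
    compiled = []
    # static_order() yields upstream documents before their dependents.
    for uri in TopologicalSorter(refs).static_order():
        upstream_changed = any(ref in changed_uris for ref in refs.get(uri, ()))
        if uri not in source_stale and not upstream_changed:
            continue  # nothing this document depends on has changed
        output = compile_fn(uri)
        compiled.append(uri)
        new_hash = hashlib.sha256(output.encode("utf-8")).hexdigest()
        if cached_hashes.get(uri) != new_hash:
            changed_uris.add(uri)  # only now do dependents become stale
        cached_hashes[uri] = new_hash
    return compiled
```

Because each document's staleness is evaluated only after all of its upstreams have finished, a document that was tentatively pending is skipped automatically when every upstream completes with an unchanged hash. An engine that compiles documents concurrently would still need an explicit re-check along the lines of the note above.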

Labels: enhancement
