Add content-aware overlap detection for agent contributions #765

@james-in-a-box

Context

The Entireio CLI implements content-aware overlap detection that goes beyond simple filename matching:

  • Uses blob hash comparison instead of reading full file contents
  • Distinguishes between genuine user edits and "reverted and replaced" scenarios
  • New files require exact blob hash matches to count as overlap
  • Modified files always count as overlap
  • Returns list of files with remaining agent changes after partial commits

This prevents false positives when a user reverts agent changes and writes their own version.
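The rules above could be sketched roughly as follows. This is an illustration in Python, not the Entireio CLI's actual code: `FileState`, `overlaps`, and `remaining_agent_files` are hypothetical names, and the "remaining changes" rule (the committed blob differs from the agent's blob) is an assumption about how partial commits might be handled. Only the blob hashing itself follows Git's documented SHA-1 object format.

```python
import hashlib
from dataclasses import dataclass
from typing import Dict, List, Optional


def git_blob_hash(content: bytes) -> str:
    """Hash content the way Git hashes a blob in a SHA-1 repository."""
    header = b"blob %d\x00" % len(content)
    return hashlib.sha1(header + content).hexdigest()


@dataclass(frozen=True)
class FileState:
    """Hypothetical record of one agent-touched file, as blob hashes."""
    base: Optional[str]       # hash before the agent ran; None if the file is new
    agent: str                # hash of the version the agent produced
    committed: Optional[str]  # hash in the commit; None if the file is not in it


def overlaps(state: FileState) -> bool:
    """Apply the overlap rules from the list above to a single file."""
    if state.committed is None:
        return False  # the commit does not contain this file
    if state.base is None:
        # New file: only an exact blob match counts, so a file the user
        # reverted and rewrote from scratch is not credited to the agent.
        return state.committed == state.agent
    # Modified file: always counts as overlap.
    return True


def remaining_agent_files(states: Dict[str, FileState]) -> List[str]:
    """Assumed rule: agent changes remain wherever the commit's blob
    differs from the agent's blob (including files left uncommitted)."""
    return sorted(p for p, s in states.items() if s.committed != s.agent)
```

Because only hashes are compared, none of these checks needs to read file contents once the blob IDs are known.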

Proposal

Add content-aware overlap detection to our checkpoint/attribution system:

  • When comparing agent work to committed code, use blob hashes for efficiency
  • Detect whether human changes overwrite or extend agent work
  • Track which agent contributions survived into the final commit
  • Use this data to improve agent effectiveness metrics
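One possible way to measure how much agent work survived is sketched below. It is only an assumption about how such a metric might look. Blob hashes can tell that a file is unchanged or fully reverted, but distinguishing "overwritten" from "extended" needs a line-level comparison, so this sketch uses Python's standard `difflib`. The names `added_lines`, `survival_ratio`, and `classify`, and the labels they return, are hypothetical.

```python
import difflib
from collections import Counter
from typing import List


def added_lines(base: str, other: str) -> List[str]:
    """Lines that `other` inserts or replaces relative to `base`."""
    base_lines = base.splitlines()
    other_lines = other.splitlines()
    matcher = difflib.SequenceMatcher(a=base_lines, b=other_lines, autojunk=False)
    added: List[str] = []
    for tag, _i1, _i2, j1, j2 in matcher.get_opcodes():
        if tag in ("insert", "replace"):
            added.extend(other_lines[j1:j2])
    return added


def survival_ratio(base: str, agent: str, committed: str) -> float:
    """Fraction of agent-added lines that also appear among the lines
    the commit adds relative to the same base (0.0 if the agent added none)."""
    agent_added = added_lines(base, agent)
    if not agent_added:
        return 0.0
    available = Counter(added_lines(base, committed))
    survived = 0
    for line in agent_added:
        if available[line] > 0:
            available[line] -= 1  # match each committed line at most once
            survived += 1
    return survived / len(agent_added)


def classify(base: str, agent: str, committed: str) -> str:
    """Hypothetical labels for what happened to the agent's work in one file."""
    if committed == agent:
        return "kept"
    ratio = survival_ratio(base, agent, committed)
    if ratio == 1.0:
        return "extended"      # all agent lines survive, plus human additions
    if ratio == 0.0:
        return "overwritten"   # none of the agent's added lines survive
    return "partial"
```

Averaging `survival_ratio` across the files in a checkpoint would give one candidate number for an agent effectiveness metric. A cheap hash comparison could skip the diff entirely whenever the committed blob equals the agent's or the base blob.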

This complements the attribution tracking work (separate issue) and would help answer "how much of the agent's work was actually used?"

Reference

See entireio/cli — content overlap detection in the strategy package.

Authored-by: egg
