Problem or motivation
We use SocratiCode as a centralized hub of context so that every team member works from the same source of truth for the project's context.
In team settings with a centralized SocratiCode and Qdrant instance, two workflows break:
- Long-lived feature branches: the shared main index diverges from the branch's actual state. Re-indexing from scratch is the only option today, which is expensive (minutes for large repos).
- Multiple developers on the same branch: if two devs have the file watcher running against the same collection, each overwrites the other's indexed view of local, uncommitted changes. The index reflects whoever wrote last, not either developer's actual working state.
Proposed solution
Proposed model
A three-tier index hierarchy, where each tier is a Qdrant collection:
main                     shared, updated on merge
└── branch/{name}        shared per branch, updated on push
    └── head/{user}      personal, diff-only, ephemeral
Branch collection — a snapshot-fork of main at branch creation time. No re-embedding, just a Qdrant collection copy:
POST /collections/{source}/snapshots
PUT /collections/{target}/snapshots/recover # target auto-created
codebase_update then only processes files that actually changed on the branch.
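The fork described above could be scripted roughly as follows. This is a minimal sketch, not SocratiCode code: `fork_collection`, `recover_request`, and `snapshot_url` are hypothetical names, and it assumes the Qdrant node can download its own snapshot from the same base URL the client uses and that no API key is required.

```python
import json
import urllib.request
from urllib.parse import quote

def snapshot_url(base_url: str, collection: str, snapshot_name: str) -> str:
    """URL from which a named snapshot of a collection can be downloaded."""
    base = base_url.rstrip("/")
    return f"{base}/collections/{quote(collection, safe='')}/snapshots/{quote(snapshot_name, safe='')}"

def recover_request(base_url: str, source: str, target: str, snapshot_name: str):
    """Build (method, url, body) for recovering `target` from a snapshot of `source`."""
    url = f"{base_url.rstrip('/')}/collections/{quote(target, safe='')}/snapshots/recover"
    return "PUT", url, {"location": snapshot_url(base_url, source, snapshot_name)}

def _call(method: str, url: str, body=None) -> dict:
    """Default transport: send a JSON request and decode the JSON response."""
    data = json.dumps(body).encode() if body is not None else None
    req = urllib.request.Request(
        url, data=data, method=method, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

def fork_collection(base_url: str, source: str, target: str, call=_call) -> str:
    """Snapshot `source`, then recover that snapshot into `target`.

    Returns the snapshot name so the caller can delete it afterwards.
    """
    created = call("POST", f"{base_url.rstrip('/')}/collections/{quote(source, safe='')}/snapshots")
    name = created["result"]["name"]
    call(*recover_request(base_url, source, target, name))
    return name
```

Injecting `call` keeps the two-step sequence testable without a running Qdrant instance. The snapshot file stays on disk after recovery, so a real implementation would probably also delete it.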
Head collection — a snapshot-fork of the branch collection, personal per developer. The file watcher writes only here, scoped to git status --porcelain (locally modified files only). Discarded and re-forked from the branch collection on each push.
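To illustrate the scoping, the watcher could derive its work list from the porcelain output along these lines. This is a simplified sketch: `parse_porcelain` and `local_changes` are hypothetical helpers, and it assumes porcelain v1 output without quoted paths (paths containing special characters would need `-z` or unquoting).

```python
import subprocess

def parse_porcelain(output: str) -> tuple[list[str], list[str]]:
    """Split `git status --porcelain` (v1) output into paths to index and paths to remove."""
    to_index, to_remove = [], []
    for line in output.splitlines():
        # Each entry is two status characters, a space, then the path.
        if len(line) < 4 or line.startswith("!!"):  # skip blanks and ignored files
            continue
        status, path = line[:2], line[3:]
        if status[0] in ("R", "C") and " -> " in path:
            # Renames and copies are reported as "old -> new".
            old, path = path.split(" -> ", 1)
            if status[0] == "R":
                to_remove.append(old)
        if "D" in status:
            to_remove.append(path)
        else:
            to_index.append(path)
    return to_index, to_remove

def local_changes(repo_dir: str) -> tuple[list[str], list[str]]:
    """Run git in `repo_dir` and parse its porcelain output."""
    result = subprocess.run(
        ["git", "status", "--porcelain"],
        cwd=repo_dir, capture_output=True, text=True, check=True,
    )
    return parse_porcelain(result.stdout)
```

Tracking removals separately matters because a deleted or renamed file should disappear from the head collection rather than linger as a stale hit.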
Naming convention
{project_id}__branch__{branch_name}
{project_id}__branch__{branch_name}__head__{user_id}
user_id could default to the sanitized value of git config user.email, and be overridable via a SOCRATICODE_USER_ID env var.
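One possible reading of this convention is sketched below. The sanitization rule (lowercase, collapse anything outside `[a-z0-9-]` into a single `_`) is an assumption chosen so that the `__` delimiter stays unambiguous; the helper names are hypothetical.

```python
import os
import re
import subprocess

def sanitize(value: str) -> str:
    """Lowercase and collapse every run of characters outside [a-z0-9-] into one '_'."""
    return re.sub(r"[^a-z0-9-]+", "_", value.strip().lower()).strip("_")

def resolve_user_id(env=os.environ) -> str:
    """Prefer SOCRATICODE_USER_ID, else fall back to `git config user.email`."""
    raw = env.get("SOCRATICODE_USER_ID") or subprocess.run(
        ["git", "config", "user.email"], capture_output=True, text=True
    ).stdout
    user_id = sanitize(raw)
    if not user_id:
        raise ValueError("no user id: set SOCRATICODE_USER_ID or git config user.email")
    return user_id

def branch_collection(project_id: str, branch_name: str) -> str:
    return f"{project_id}__branch__{sanitize(branch_name)}"

def head_collection(project_id: str, branch_name: str, user_id: str) -> str:
    return f"{branch_collection(project_id, branch_name)}__head__{sanitize(user_id)}"
```

For example, `head_collection("proj", "feature/login-v2", "Jane.Doe@example.com")` yields `proj__branch__feature_login-v2__head__jane_doe_example_com`. Note that this rule maps `feature/foo` and `feature_foo` to the same name, so a real implementation might append a short hash of the original branch name.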
Search behavior
When a head collection exists, codebase_search does a union search:
- Query __head__ and __branch__ in parallel
- Deduplicate by file path; __head__ wins on conflict
This gives the developer accurate context for files they've touched locally, and shared branch context for everything else. As a simpler v1 alternative, explicit PROJECT_ID switching (no union search) would already solve the collision problem.
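The union search could look roughly like the sketch below. The `Hit` type, the `search` callable, and the `head_paths` parameter are assumptions for illustration, not existing SocratiCode APIs. `head_paths` carries the set of locally modified files so that a stale branch hit is dropped even when the head collection returns nothing for that file on a given query.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass

@dataclass(frozen=True)
class Hit:
    path: str
    score: float
    text: str = ""

def merge_hits(head_hits, branch_hits, head_paths=frozenset(), limit=10):
    """Union of both result lists; any path known to the head collection shadows branch hits."""
    shadowed = set(head_paths) | {h.path for h in head_hits}
    kept = list(head_hits) + [h for h in branch_hits if h.path not in shadowed]
    kept.sort(key=lambda h: h.score, reverse=True)
    return kept[:limit]

def union_search(search, query, head, branch, head_paths=frozenset(), limit=10):
    """Query the head and branch collections in parallel, then merge.

    `search(collection, query, limit)` is a caller-supplied function returning Hits.
    """
    with ThreadPoolExecutor(max_workers=2) as pool:
        head_future = pool.submit(search, head, query, limit)
        branch_future = pool.submit(search, branch, query, limit)
        return merge_hits(head_future.result(), branch_future.result(), head_paths, limit)
```

Sorting the merged list by score assumes both collections use the same embedding model and distance metric, which holds if the head collection is forked from the branch collection.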
Lifecycle
branch created   → fork main → branch collection
worktree added   → fork branch → head collection (per developer)
git push         → CI codebase_update on branch collection
                 → head collections discarded + re-forked
PR merged        → branch collection removed
                 → main collection updated
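The lifecycle above can be read as a mapping from git events to collection operations. The sketch below makes that mapping explicit. The event names, the operation tuples, and the assumption that the main collection is named by the bare project id are all hypothetical.

```python
def plan(event: str, project: str, branch: str, users: tuple[str, ...] = ()) -> list[tuple[str, ...]]:
    """Return the ordered collection operations for a lifecycle event.

    Operations are ("fork", source, target), ("update", collection),
    or ("delete", collection).
    """
    main = project  # assumption: the main collection is named after the project id
    branch_col = f"{project}__branch__{branch}"
    heads = [f"{branch_col}__head__{user}" for user in users]

    if event == "branch_created":
        return [("fork", main, branch_col)]
    if event == "worktree_added":
        return [("fork", branch_col, head) for head in heads]
    if event == "push":
        ops: list[tuple[str, ...]] = [("update", branch_col)]
        for head in heads:
            # Discard the stale personal index, then re-fork from the updated branch.
            ops += [("delete", head), ("fork", branch_col, head)]
        return ops
    if event == "pr_merged":
        return [("delete", branch_col), ("update", main)]
    raise ValueError(f"unknown event: {event}")
```

Returning a plan instead of executing it directly would let the same logic back an MCP tool, a CLI subcommand, or a CI step.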
Questions
- Is the Qdrant snapshot API accessible in the managed Docker setup?
- Preference on surfacing this as MCP tools, CLI subcommands, or both?
- Does the union search approach feel right for v1, or start with explicit PROJECT_ID switching?
- Any concerns about snapshot performance at 40M+ line scale?
Alternatives considered
No response
Area
Search
Additional context
No response
Checklist