-
Notifications
You must be signed in to change notification settings - Fork 0
feat: implement multi-tenant space architecture #91
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
- Remove build-only profile from worker service so it auto-builds on docker compose up - Add worker image validation on gateway startup with helpful error messages - Add upgrade instructions to README template (pull images, rebuild worker, update env vars) - Fixes issue where users see 'No such image' error on first message after init 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>
Migrate from thread-scoped volumes and user-scoped credentials to
space-based multi-tenant architecture where users/groups share
persistent workspace across all their threads.
Key changes:
- Volumes: peerbot-workspace-{threadId} → peerbot-workspace-{spaceId}
- Credentials: mcp:credential:{userId}:{mcpId} → mcp:credential:{spaceId}:{mcpId}
- Claude credentials: claude:credential:{userId} → claude:credential:{spaceId}
Space ID format:
- DM/User context: user-{hash8} (hash of {platform}:user:{userId})
- Group/Channel context: group-{hash8} (hash of {platform}:group:{channelId})
Changes:
- Add space-resolver module for consistent spaceId generation
- Update all credential stores to use spaceId
- Update OAuth flows to include spaceId in state
- Update deployment managers to use space-based volumes
- Update worker token to include spaceId
- Add WhatsApp platform adapter
- Clean up unused files and dependencies
|
Claude finished @buremba's task —— View job PR Review: Multi-Tenant Space ArchitectureI've completed my comprehensive review of this PR implementing the migration from thread-scoped to space-based multi-tenant architecture. Here's my detailed feedback: Review Tasks:
✅ Strengths1. Well-Designed Space Architecture
2. Comprehensive Migration
3. Security Considerations
|
Mesa DescriptionTL;DRMigrates from thread-scoped volumes and user-scoped credentials to a space-based multi-tenant architecture where users/groups share persistent workspaces across all their threads. What changed?
Description generated by Mesa. Update settings |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Performed full review of 9c33352...16b8723
Analysis
-
Hash Collision Risk: The 8-character SHA256 space ID (32 bits entropy) creates significant collision probability at scale (~1% at 77,000 spaces), potentially causing unauthorized workspace sharing, credential leakage, and privacy violations. Increase hash length to 12-16 characters or implement collision detection.
-
Unauthorized Credential Sharing: Group space design allows ALL members to use ANY member's authenticated credentials without consent, violating OAuth provider terms and creating security vulnerabilities when group membership changes. Implement user-scoped credentials or explicit consent flows.
-
Missing Volume Lifecycle Management: Space PVCs are never deleted, causing unbounded storage growth, orphaned resources, and potential compliance issues. Implement cleanup policies based on activity.
-
Broken Backwards Compatibility: No migration logic exists for user-scoped credentials, forcing users to re-authenticate without explanation and losing existing credentials. Implement credential migration or dual-lookup during transition.
-
Brittle Platform Detection: Space resolution relies on string prefix/suffix matching for platform detection without proper validation or error handling for unexpected formats.
Tip
Help
Slash Commands:
/review- Request a full code review/review latest- Review only changes since the last review/describe- Generate PR description. This will update the PR body or issue comment depending on your configuration/help- Get help with Mesa commands and configuration options
0 files reviewed | 4 comments | Edit Agent Settings • Read Docs
| // Don't throw - deployment deletion should succeed even if PVC cleanup fails | ||
| } | ||
| } | ||
| // NOTE: Space PVCs are NOT deleted on deployment deletion |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Medium: Unbounded PVC Growth
Space PVCs are intentionally never deleted (for persistence across threads), but this creates operational and compliance concerns:
Issues:
- Unbounded Storage Growth: Every unique space creates a PVC that persists indefinitely
- Orphaned Resources: Inactive spaces continue consuming storage resources
- Data Retention: May violate data retention policies (GDPR, etc.) if user data persists after account deletion
- Cost: Cloud storage costs grow linearly with number of unique spaces ever created
Recommendations:
- Implement PVC cleanup policy based on last access timestamp
- Add monitoring for PVC count and total storage usage
- Create admin API/tool to list and cleanup inactive space PVCs
- Document data retention policy and cleanup procedures
- Consider adding PVC expiration annotations that can be extended on access
Example cleanup policy: Delete PVCs not accessed in 90 days (configurable)
Prompt for Agent
Task: Address review feedback left on GitHub.
Repository: buremba/peerbot#91
File: packages/gateway/src/orchestration/impl/k8s-deployment.ts#L713
Action: Open this file location in your editor, inspect the highlighted code, and resolve the issue described below.
Feedback:
**Medium: Unbounded PVC Growth**
Space PVCs are intentionally never deleted (for persistence across threads), but this creates operational and compliance concerns:
**Issues:**
1. **Unbounded Storage Growth**: Every unique space creates a PVC that persists indefinitely
2. **Orphaned Resources**: Inactive spaces continue consuming storage resources
3. **Data Retention**: May violate data retention policies (GDPR, etc.) if user data persists after account deletion
4. **Cost**: Cloud storage costs grow linearly with number of unique spaces ever created
**Recommendations:**
1. Implement PVC cleanup policy based on last access timestamp
2. Add monitoring for PVC count and total storage usage
3. Create admin API/tool to list and cleanup inactive space PVCs
4. Document data retention policy and cleanup procedures
5. Consider adding PVC expiration annotations that can be extended on access
Example cleanup policy: Delete PVCs not accessed in 90 days (configurable)
| // Generate worker authentication token with platform info | ||
| // Check both top-level teamId (WhatsApp) and platformMetadata.teamId (Slack) | ||
| const teamId = messageData.teamId || platformMetadata?.teamId; | ||
| const spaceId = messageData.spaceId || threadId; // Fall back to threadId for backwards compatibility |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
High: Missing Credential Migration Path
This fallback to threadId provides backwards compatibility for deployment naming, but there's no corresponding migration logic for existing credentials.
Problem:
Existing credentials stored as:
claude:credential:{userId}mcp:credential:{userId}:{mcpId}
Will NOT be accessible after this deployment because the new code looks for:
claude:credential:{spaceId}mcp:credential:{spaceId}:{mcpId}
Impact:
- All users will lose access to their existing OAuth credentials
- Users must re-authenticate without explanation
- Poor user experience during rollout
Recommendations:
- Implement credential migration script that runs during deployment
- Add dual-lookup fallback: check spaceId first, fall back to userId for transition period
- Add migration timeline and deprecation warnings
- Consider adding a migration flag to track which credentials have been migrated
Prompt for Agent
Task: Address review feedback left on GitHub.
Repository: buremba/peerbot#91
File: packages/gateway/src/orchestration/base-deployment-manager.ts#L225
Action: Open this file location in your editor, inspect the highlighted code, and resolve the issue described below.
Feedback:
**High: Missing Credential Migration Path**
This fallback to `threadId` provides backwards compatibility for deployment naming, but there's no corresponding migration logic for existing credentials.
**Problem:**
Existing credentials stored as:
- `claude:credential:{userId}`
- `mcp:credential:{userId}:{mcpId}`
Will NOT be accessible after this deployment because the new code looks for:
- `claude:credential:{spaceId}`
- `mcp:credential:{spaceId}:{mcpId}`
**Impact:**
- All users will lose access to their existing OAuth credentials
- Users must re-authenticate without explanation
- Poor user experience during rollout
**Recommendations:**
1. Implement credential migration script that runs during deployment
2. Add dual-lookup fallback: check spaceId first, fall back to userId for transition period
3. Add migration timeline and deprecation warnings
4. Consider adding a migration flag to track which credentials have been migrated
| * Uses first 8 chars of SHA256 for uniqueness with K8s label compatibility. | ||
| */ | ||
| export function hashPlatformId(id: string): string { | ||
| return createHash("sha256").update(id).digest("hex").substring(0, 8); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Critical: Hash Collision Risk
Using only 8 characters (32 bits) of SHA256 creates a realistic collision risk. By the birthday paradox, collision probability reaches ~1% at around 77,000 spaces, and 50% at ~400,000 spaces.
For a production system with thousands of users across multiple platforms, collisions could cause:
- Different users/groups sharing the same workspace volume
- Credential leakage between unrelated spaces
- Data privacy violations
Recommendations:
- Increase hash length to at least 12-16 characters (48-64 bits)
- Implement collision detection and retry with salt
- Add monitoring/alerting for hash collisions
- Consider including a timestamp component to reduce collision probability
Prompt for Agent
Task: Address review feedback left on GitHub.
Repository: buremba/peerbot#91
File: packages/gateway/src/spaces/space-resolver.ts#L20
Action: Open this file location in your editor, inspect the highlighted code, and resolve the issue described below.
Feedback:
**Critical: Hash Collision Risk**
Using only 8 characters (32 bits) of SHA256 creates a realistic collision risk. By the birthday paradox, collision probability reaches ~1% at around 77,000 spaces, and 50% at ~400,000 spaces.
For a production system with thousands of users across multiple platforms, collisions could cause:
- Different users/groups sharing the same workspace volume
- Credential leakage between unrelated spaces
- Data privacy violations
**Recommendations:**
1. Increase hash length to at least 12-16 characters (48-64 bits)
2. Implement collision detection and retry with salt
3. Add monitoring/alerting for hash collisions
4. Consider including a timestamp component to reduce collision probability
| * Detect if context represents a group/channel based on platform heuristics. | ||
| * Use when isGroup is not explicitly available. | ||
| */ | ||
| export function isGroupContext(platform: string, channelId: string): boolean { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Medium: Brittle Platform Detection Heuristics
Using string prefix/suffix matching for group detection is fragile:
Risks:
- Slack/WhatsApp could change their ID format in the future
- No validation for unexpected formats
- Silent failures could cause incorrect space resolution (users in groups treated as DMs or vice versa)
- Other platforms may not follow predictable patterns
Specific Concerns:
- Slack: What about private channels, threads, or new channel types?
- WhatsApp: What about broadcast lists or status updates?
- Future platforms: No extensibility mechanism
Recommendations:
- Add validation and error logging for unexpected channel ID formats
- Make platform detection configurable/extensible
- Add explicit
isGroupparameter to the context (don't rely on heuristics) - Return an error/warning when heuristic detection is uncertain
- Add unit tests for various channel ID formats
Prompt for Agent
Task: Address review feedback left on GitHub.
Repository: buremba/peerbot#91
File: packages/gateway/src/spaces/space-resolver.ts#L52
Action: Open this file location in your editor, inspect the highlighted code, and resolve the issue described below.
Feedback:
**Medium: Brittle Platform Detection Heuristics**
Using string prefix/suffix matching for group detection is fragile:
**Risks:**
1. Slack/WhatsApp could change their ID format in the future
2. No validation for unexpected formats
3. Silent failures could cause incorrect space resolution (users in groups treated as DMs or vice versa)
4. Other platforms may not follow predictable patterns
**Specific Concerns:**
- Slack: What about private channels, threads, or new channel types?
- WhatsApp: What about broadcast lists or status updates?
- Future platforms: No extensibility mechanism
**Recommendations:**
1. Add validation and error logging for unexpected channel ID formats
2. Make platform detection configurable/extensible
3. Add explicit `isGroup` parameter to the context (don't rely on heuristics)
4. Return an error/warning when heuristic detection is uncertain
5. Add unit tests for various channel ID formats
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
💡 Codex Review
https://github.com/buremba/peerbot/blob/16b8723b218d3fd3bc4af0b83cc1600030350b9c/packages/gateway/src/orchestration/message-consumer.ts#L124-L125
Check credentials by spaceId instead of userId
Claude credentials are now keyed by spaceId (claude:credential:{spaceId}), but the orchestrator still calls hasCredentials(data.userId) here; because spaceIds are hashed (user-…/group-…), this check will always fail even when a space is already authenticated, so every message will trigger the auth prompt and existing credentials are never reused. Use data.spaceId (with a threadId fallback if needed) for the credential lookup.
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| platformMetadata: { | ||
| jid: context.chatJid, | ||
| senderJid: context.senderJid, | ||
| senderE164: context.senderE164, | ||
| senderName: context.senderName, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Include spaceId in WhatsApp metadata for auth flow
The WhatsApp auth adapter derives the space from platformMetadata.spaceId and otherwise falls back to channelId; this payload’s platformMetadata omits spaceId, so OAuth state/credentials are stored under the raw chat JID instead of the hashed spaceId. Subsequent lookups use the hashed spaceId and won’t find those credentials, resulting in an auth loop for WhatsApp chats. Add spaceId to platformMetadata (or recompute it in the adapter) so storage and lookup keys match.
Useful? React with 👍 / 👎.
| // Use spaceId for volume naming (shared across threads in same space) | ||
| // Fall back to threadId for backwards compatibility | ||
| const spaceId = messageData?.spaceId || threadId; | ||
|
|
||
| // Determine if running in Docker and resolve project paths |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Space-scoped volumes still yield thread-scoped workdirs
Docker deployments now mount a space-scoped volume at /workspace, but the worker still derives its working directory from THREAD_ID/sessionKey (see WorkspaceManager), so each new thread in the same space will use a fresh /workspace/<threadId> subdir and won’t see files from previous threads. That undermines the stated goal of a shared workspace across threads; consider switching the worker workspace directory to spaceId when SPACE_ID is provided.
Useful? React with 👍 / 👎.
Summary
Migrate from thread-scoped volumes and user-scoped credentials to space-based multi-tenant architecture where users/groups share persistent workspace across all their threads.
Key changes:
peerbot-workspace-{threadId}→peerbot-workspace-{spaceId}mcp:credential:{userId}:{mcpId}→mcp:credential:{spaceId}:{mcpId}claude:credential:{userId}→claude:credential:{spaceId}Space ID format:
user-{hash8}{platform}:user:{userId}group-{hash8}{platform}:group:{channelId}Changes:
space-resolvermodule for consistent spaceId generationTest plan
Note
Implements space-scoped multi-tenancy and modernizes platform/infra tooling.
spaceIdacross core types and module interfaces; switch credential stores toclaude:credential:{spaceId}andmcp:credential:{spaceId}:{mcpId}; includespaceIdin worker tokens and OAuth state/flows; update MCP config/status to be space-scoped@whiskeysockets/baileys,qrcode-terminal); docs reflect WhatsApp as primary platform8118; add startup/liveness/readiness probes, secrets checksum annotation; stricter NetworkPolicies enforcing gateway-proxied egress and worker isolation; namespace-scoped worker RBAC; optional metricsServiceMonitor; values tuned (securityContext read-only, Gatekeeper stubs, WhatsApp/allowlist settings)tsxrunner; sidecar-first dev flow, Makefile streamlined (build-worker, process mgmt); remove docker-compose.dev; logging fallback in core (USE_SIMPLE_LOGGER) and safer Sentry initWritten by Cursor Bugbot for commit 16b8723. This will update automatically on new commits. Configure here.