-
Notifications
You must be signed in to change notification settings - Fork 134
Proposal to refactor the MCP Registry controller #2301
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Signed-off-by: Daniele Martinoli <dmartino@redhat.com>
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## main #2301 +/- ##
==========================================
+ Coverage 53.57% 53.65% +0.07%
==========================================
Files 238 239 +1
Lines 30658 30704 +46
==========================================
+ Hits 16424 16473 +49
+ Misses 13064 13063 -1
+ Partials 1170 1168 -2 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
|
Thanks for this @dmartinol I'll mark this to review when ive got a bit of downtime! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added a couple of comments inline but in general I think this looks great.
Also tagging @rdimitrov @ChrisJBurns for visibility, but my take is let's clear the minor comments, merge, file issues and let's go.
|
|
||
| **Controller:** | ||
| - Deploy and manage Registry API (`Deployment`, `Service`) | ||
| - Create and update configuration `ConfigMap` from `MCPRegistry` spec |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is the configuration of the registry server right?
| ### Configuration Management | ||
|
|
||
| **Flow:** | ||
| 1. Controller creates `ConfigMap` `<registry-name>-config` containing `MCPRegistry` spec |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is not the MCPRegistry spec verbatim, but a config file of the registry server, correct?
|
|
||
| **File-based (Phase 1):** | ||
| - Registry API stores data at path defined by `--storage-path` flag | ||
| - Controller sets `--storage-path /data` using mounted `emptyDir` volume |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree we should start with file-based. We don't have to document this storage if it's temporary.
| **File-based (Phase 1):** | ||
| - Registry API stores data at path defined by `--storage-path` flag | ||
| - Controller sets `--storage-path /data` using mounted `emptyDir` volume | ||
| - Data format: ToolHive registry JSON schema (unchanged from current) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another reason to not make this a permanent flag (but I agree it is a reasonable middle step)
| - Data format: ToolHive registry JSON schema (unchanged from current) | ||
| - Persistence: Data survives pod restarts, lost on deployment deletion | ||
|
|
||
| **Database-backed (Future):** Deferred to separate proposal for multi-replica support. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1
| - Registry API exposes `GET /api/v1/status` returning sync history, stored in the configured storage path | ||
| - Controller polls every 60 seconds and updates `syncStatus` | ||
| - Pros: Simple, no new CRDs, works for CLI and Kubernetes | ||
| - Cons: 60-second status latency |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we can even start with a middle ground if we want to move to a CRD which I think is reasonable - have the controller poll and create the CRD, later we can move to the API server creating the CRD with no breakage.
| ### Manual Sync | ||
|
|
||
| - Registry API exposes `POST /api/v1/sync` endpoint | ||
| - Controller calls endpoint when `toolhive.stacklok.dev/sync-trigger` annotation changes |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The sync-trigger annotation is on a registry (one instance), so the sync should probably accept a name as well?
|
|
||
| **Goal:** Full-functioning solution with sync logic in `thv-registry-api`, maintaining backward compatibility with existing MCPRegistry resources. | ||
|
|
||
| **Task Order Rationale:** Registry API must be enhanced first before operator changes, to avoid breaking existing deployments. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1
| - Implement `--storage-path` flag for registry data location | ||
| - Add file-based storage backend matching current ConfigMap JSON format | ||
| - Atomic file writes (temp file + rename pattern) | ||
| - Load existing data on startup if present |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think these two can be done in parallel. I would also add another task here in parallel which is design the private API which will include the sync.
| 7. **Add sync status endpoint** | ||
| - Implement `GET /api/v1/status` returning sync history | ||
| - Return format matching proposed JSON schema | ||
| - Read from sync history file |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the shape of the API should come sooner even if it returns placeholders.
| - Copy `cmd/thv-operator/pkg/sync` to `thv-registry-api` | ||
| - Adapt sync logic to use file-based storage instead of ConfigMap | ||
| - Implement retry logic (5-minute interval, unlimited attempts) | ||
| - Remove Kubernetes-specific dependencies |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A note that I think should be important is that we didn't bother with concurrency much in the operator. We probably should in the server. Should this be an explicit ask in the issues?
Addresses #2287