Skip to content

Conversation

@dmartinol
Copy link
Collaborator

Addresses #2287

Signed-off-by: Daniele Martinoli <dmartino@redhat.com>
@codecov
Copy link

codecov bot commented Oct 23, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 53.65%. Comparing base (f145cfc) to head (f531c34).
⚠️ Report is 4 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #2301      +/-   ##
==========================================
+ Coverage   53.57%   53.65%   +0.07%     
==========================================
  Files         238      239       +1     
  Lines       30658    30704      +46     
==========================================
+ Hits        16424    16473      +49     
+ Misses      13064    13063       -1     
+ Partials     1170     1168       -2     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@ChrisJBurns
Copy link
Collaborator

Thanks for this @dmartinol I'll mark this to review when ive got a bit of downtime!

Copy link
Contributor

@jhrozek jhrozek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added a couple of comments inline but in general I think this looks great.

Also tagging @rdimitrov @ChrisJBurns for visibility, but my take is let's clear the minor comments, merge, file issues and let's go.


**Controller:**
- Deploy and manage Registry API (`Deployment`, `Service`)
- Create and update configuration `ConfigMap` from `MCPRegistry` spec
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is the configuration of the registry server right?

### Configuration Management

**Flow:**
1. Controller creates `ConfigMap` `<registry-name>-config` containing `MCPRegistry` spec
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is not the MCPRegistry spec verbatim, but a config file of the registry server, correct?


**File-based (Phase 1):**
- Registry API stores data at path defined by `--storage-path` flag
- Controller sets `--storage-path /data` using mounted `emptyDir` volume
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree we should start with file-based. We don't have to document this storage if it's temporary.

**File-based (Phase 1):**
- Registry API stores data at path defined by `--storage-path` flag
- Controller sets `--storage-path /data` using mounted `emptyDir` volume
- Data format: ToolHive registry JSON schema (unchanged from current)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another reason to not make this a permanent flag (but I agree it is a reasonable middle step)

- Data format: ToolHive registry JSON schema (unchanged from current)
- Persistence: Data survives pod restarts, lost on deployment deletion

**Database-backed (Future):** Deferred to separate proposal for multi-replica support.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

- Registry API exposes `GET /api/v1/status` returning sync history, stored in the configured storage path
- Controller polls every 60 seconds and updates `syncStatus`
- Pros: Simple, no new CRDs, works for CLI and Kubernetes
- Cons: 60-second status latency
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can even start with a middle ground if we want to move to a CRD which I think is reasonable - have the controller poll and create the CRD, later we can move to the API server creating the CRD with no breakage.

### Manual Sync

- Registry API exposes `POST /api/v1/sync` endpoint
- Controller calls endpoint when `toolhive.stacklok.dev/sync-trigger` annotation changes
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The sync-trigger annotation is on a registry (one instance), so the sync should probably accept a name as well?


**Goal:** Full-functioning solution with sync logic in `thv-registry-api`, maintaining backward compatibility with existing MCPRegistry resources.

**Task Order Rationale:** Registry API must be enhanced first before operator changes, to avoid breaking existing deployments.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

- Implement `--storage-path` flag for registry data location
- Add file-based storage backend matching current ConfigMap JSON format
- Atomic file writes (temp file + rename pattern)
- Load existing data on startup if present
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think these two can be done in parallel. I would also add another task here in parallel which is design the private API which will include the sync.

7. **Add sync status endpoint**
- Implement `GET /api/v1/status` returning sync history
- Return format matching proposed JSON schema
- Read from sync history file
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the shape of the API should come sooner even if it returns placeholders.

- Copy `cmd/thv-operator/pkg/sync` to `thv-registry-api`
- Adapt sync logic to use file-based storage instead of ConfigMap
- Implement retry logic (5-minute interval, unlimited attempts)
- Remove Kubernetes-specific dependencies
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A note that I think should be important is that we didn't bother with concurrency much in the operator. We probably should in the server. Should this be an explicit ask in the issues?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants