[feat, multicast] Add IP pool multicast support #9084
This work introduces multicast IP pool capabilities to support external multicast traffic routing through the rack's switching infrastructure.
Includes:
- Add IpPoolType enum (unicast/multicast) with unicast as default
- Add multicast pool fields: switch_port_uplinks (UUID[]), mvlan (VLAN ID)
- Add database migration (multicast-support/up01.sql) with new columns and indexes
- Add ASM/SSM range validation for multicast pools to prevent mixing
- Add pool type-aware resolution for IP allocation
- Add custom deserializer for switch port uplinks with deduplication
- Update external API params/views for multicast pool configuration
- Add SSM constants (IPV4_SSM_SUBNET, IPV6_SSM_FLAG_FIELD) for validation
Database schema updates:
- ip_pool table: pool_type, switch_port_uplinks, mvlan columns
- Index on pool_type for efficient filtering
- Migration preserves existing pools as unicast type by default
This provides the foundation for multicast group functionality while maintaining full backward compatibility with existing unicast pools.
References (for review):
- RFD 488: https://rfd.shared.oxide.computer/rfd/488
- Dendrite PRs (based on recency):
* oxidecomputer/dendrite#132
* oxidecomputer/dendrite#109
* oxidecomputer/dendrite#14
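The ASM/SSM range validation mentioned in the description can be sketched roughly as follows. This is a minimal, std-only illustration and not the code in this PR: `is_ssm`, `validate_multicast_pool`, and `PoolRangeError` are hypothetical names, and the checks assume the conventional SSM ranges (IPv4 232.0.0.0/8, and IPv6 addresses whose multicast flags nibble is 3). It inspects individual addresses, for example the first and last address of a range.

```rust
use std::net::IpAddr;

/// Returns true if `addr` looks like a source-specific multicast (SSM) address.
/// Assumption: IPv4 SSM is 232.0.0.0/8; IPv6 SSM is identified only by the
/// flags nibble being 3 (ff3x::), which is a simplification.
pub fn is_ssm(addr: IpAddr) -> bool {
    match addr {
        IpAddr::V4(v4) => v4.octets()[0] == 232,
        IpAddr::V6(v6) => (v6.segments()[0] & 0xfff0) == 0xff30,
    }
}

/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq, Eq)]
pub enum PoolRangeError {
    /// The address is not a multicast address at all.
    NotMulticast(IpAddr),
    /// The pool would contain both ASM and SSM addresses.
    MixedAsmSsm,
}

/// Checks that every address is multicast and that the set is either
/// entirely ASM or entirely SSM.
pub fn validate_multicast_pool(addrs: &[IpAddr]) -> Result<(), PoolRangeError> {
    let mut seen_ssm: Option<bool> = None;
    for &addr in addrs {
        if !addr.is_multicast() {
            return Err(PoolRangeError::NotMulticast(addr));
        }
        let ssm = is_ssm(addr);
        match seen_ssm {
            // The first address decides whether the pool is ASM or SSM.
            None => seen_ssm = Some(ssm),
            // Any later address of the other kind is rejected.
            Some(prev) if prev != ssm => return Err(PoolRangeError::MixedAsmSsm),
            Some(_) => {}
        }
    }
    Ok(())
}
```

Under these assumptions, a pool holding both 224.1.1.1 and 232.1.1.1 would be rejected as mixed, while a pool of only 232.x.x.x addresses would pass.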
Introduces end-to-end multicast group support across control plane and sled-agent, integrated with IP pool extensions required
for supporting multicast workflows. This work enables project-scoped multicast groups with lifecycle-driven dataplane programming
and exposes an API for operating multicast groups over instances.
Highlights:
- DB: new multicast_group tables; member lifecycle management
- API: multicast group/member CRUD; source IP validation; VPC/project hierarchy integration with default VNI fallback
- Control plane: RPW reconcilers for groups/members; sagas that apply dataplane updates atomically at the group level; instance lifecycle hooks and piggybacking
- Dataplane: Dendrite DPD switch programming via trait abstraction; DPD client used in tests
- Sled agent: multicast-aware instance management; network interface configuration for multicast traffic; cross-version testing; OPTE stubs present
- Tests: comprehensive integration suites under nexus/tests/integration_tests/multicast/
Components:
- Database schema: external and underlay multicast groups; member/instance association tables
- Control plane modules: multicast group management, member lifecycle, dataplane abstraction; RPW reconcilers to ensure convergence
- API layer: endpoints and validation; default-VNI semantics when VPC not provided
- Sled agent: OPTE stubs and compatibility shims for older agents
Workflows Implemented:
1. Instance lifecycle integration:
- "Create" -> resolve VPC/VNI (or default), validate source IPs, create memberships, enqueue group ensure RPW
- "Start" -> program dataplane via ensure/update sagas; activate member flows after switch ack
- "Stop" -> deactivate dataplane membership; retain DB membership for fast restart
- "Delete" -> remove instance memberships; group deletion is explicit
- "Migrate" -> deactivate on source sled; activate on target; idempotent with ordering guarantees
- Restart/recovery -> RPWs reconcile desired state; compensations clean up partial programming
2. RPW reconciliation:
- ensure dataplane switches match database state
- handle sled migrations and state transitions
- Eventual consistency with retry logic
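The reconciliation idea above (make the dataplane match the database, and retry until it converges) can be illustrated as a set difference between desired and programmed state. This is only a sketch under stated assumptions: `Membership`, `ReconcilePlan`, and `plan_reconcile` are hypothetical names, and the real reconcilers talk to the database and DPD rather than to in-memory sets.

```rust
use std::collections::BTreeSet;

/// Hypothetical record: one group programmed on one switch port.
#[derive(Debug, Clone, PartialEq, Eq, PartialOrd, Ord)]
pub struct Membership {
    pub group: String,
    pub switch_port: String,
}

/// The work a single reconciler pass would need to do.
#[derive(Debug, Default, PartialEq, Eq)]
pub struct ReconcilePlan {
    /// In the database but missing from the dataplane.
    pub to_program: Vec<Membership>,
    /// On the dataplane but no longer in the database.
    pub to_remove: Vec<Membership>,
}

/// Computes the difference between desired (database) state and
/// programmed (dataplane) state. Running it again after the plan has
/// been applied yields an empty plan, so retries are harmless.
pub fn plan_reconcile(
    desired: &BTreeSet<Membership>,
    programmed: &BTreeSet<Membership>,
) -> ReconcilePlan {
    ReconcilePlan {
        to_program: desired.difference(programmed).cloned().collect(),
        to_remove: programmed.difference(desired).cloned().collect(),
    }
}
```

Because the plan is derived from current state on every pass, a failed or partial pass simply leaves a non-empty plan for the next activation, which is one way to get the eventual consistency the commit message describes.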
Migrations:
- Apply schema changes in schema/crdb/multicast-support/up01.sql (and update dbinit.sql)
- Bump schema versions accordingly
API/Compatibility:
- OpenAPI updated: openapi/nexus.json, openapi/sled-agent/sled-agent-5.0.0-89f1f7.json
- Regenerate clients where applicable
References:
- RFD 488: https://rfd.shared.oxide.computer/rfd/488
- IP Pool extensions: #9084
- Dendrite PRs (based on recency):
* oxidecomputer/dendrite#132
* oxidecomputer/dendrite#109
* oxidecomputer/dendrite#14
Follow-ups include:
- OPTE integration
- commtest extension
- omdb commands are tracked in issues
- pool and group stats
Introduces end-to-end multicast group support across control plane and sled-agent, integrated with IP pool extensions required
for supporting multicast workflows. This work enables project-scoped multicast groups with lifecycle-driven dataplane programming
and exposes an API for operating multicast groups over instances.
Highlights:
- DB: new multicast_group tables; member lifecycle management
- API: multicast group/member CRUD; source IP validation; VPC/project hierarchy integration with default VNI fallback
- Control plane: RPW reconcilers for groups/members; sagas that apply dataplane updates atomically at the group level; instance lifecycle hooks and piggybacking
- Dataplane: Dendrite DPD switch programming via trait abstraction; DPD client used in tests
- Sled agent: multicast-aware instance management; network interface configuration for multicast traffic; cross-version testing; OPTE stubs present
- Tests: comprehensive integration suites under nexus/tests/integration_tests/multicast/
Components:
- Database schema: external and underlay multicast groups; member/instance association tables
- Control plane modules: multicast group management, member lifecycle, dataplane abstraction; RPW reconcilers to ensure convergence
- API layer: endpoints and validation; default-VNI semantics when VPC not provided
- Sled agent: OPTE stubs and compatibility shims for older agents
Workflows Implemented:
1. Instance lifecycle integration:
- "Create" -> resolve VPC/VNI (or default), validate source IPs, create memberships, enqueue group ensure RPW
- "Start" -> program dataplane via ensure/update sagas; activate member flows after switch ack
- "Stop" -> deactivate dataplane membership; retain DB membership for fast restart
- "Delete" -> remove instance memberships; group deletion is explicit
- "Migrate" -> deactivate on source sled; activate on target; idempotent with ordering guarantees
- Restart/recovery -> RPWs reconcile desired state; compensations clean up partial programming
2. RPW reconciliation:
- ensure dataplane switches match database state
- handle sled migrations and state transitions
- Eventual consistency with retry logic
Migrations:
- Apply schema changes in schema/crdb/multicast-group-support/up01.sql (and update dbinit.sql)
- Bump schema versions accordingly
API/Compatibility:
- OpenAPI updated: openapi/nexus.json, openapi/sled-agent/sled-agent-5.0.0-89f1f7.json
- Contains a version change (to v5) as InstanceEnsureBody has been modified to
include multicast_groups associated with an instance in the
underlying sled config
- Regenerate clients where applicable
References:
- RFD 488: https://rfd.shared.oxide.computer/rfd/488
- IP Pool extensions: #9084
- Dendrite PRs (based on recency):
* oxidecomputer/dendrite#132
* oxidecomputer/dendrite#109
* oxidecomputer/dendrite#14
Follow-ups include:
- OPTE integration
- commtest extension
- omdb commands are tracked in issues
- pool and group stats
rcgoodfellow
left a comment
Thanks for the updates @zeeshanlakhani. Comments follow from a full read through.
rcgoodfellow
left a comment
LGTM. @internet-diglett can you take a look in terms of overall omicron integration?
internet-diglett
left a comment
Great work! A few things stuck out to me where I know you are probably just working around existing code, but they might be worth updating if / when we can do so.
Also, we add a DB constraint and specialize the range error.
@internet-diglett made that change for ranges if you wanna 👀. Updating from |
Implements end-to-end multicast networking across Omicron's control plane and sled-agent, integrated with IP pool extensions from #9084. Closes #8242.
TL;DR:
> Implements fleet-wide multicast groups across the control plane and sled-agent, integrated with IP pool extensions (#9084). Adds a reconciliation worker (RPW), inventory-based sled→switch-port mapping, a multi-switch multicast dataplane trait, and paired external/underlay groups for NAT and Source-Specific Multicast (SSM). Introduces fleet-scoped auth and a 3-state membership lifecycle; requires schema `v209` and sled-agent API `v7`; feature is disabled by default.
*Highlights*:
- An RPW for reconciling groups and instance members (ensuring dataplane state matches DB)
- Inventory-based sled→switch-port mapping with validation tests
- A multicast-focused dataplane trait separating control plane logic from Dendrite/DPD; works across multiple switches
- Bifurcated architecture with paired external/underlay groups for NAT-based forwarding
- 3-state instance member lifecycle ("Joining" → "Joined" → "Left") with reactivation support
- Fleet-scoped authorization model allowing cross-project multicast
- New DB tables: multicast_group, underlay_multicast_group, multicast_group_member
- External groups: Customer-facing IPv4/IPv6 addresses from IP pools with SSM support
- Underlay groups: Admin-scoped IPv6 (ff04::/16); default allocation from fixed ff04::/64 for internal rack forwarding
- Feature flag and reconciler/cache settings exist and default to disabled/safe values
- Member states: "Joining"/"Joined"/"Left" with soft-delete/mark-for-removal for instance lifecycle
- Group states: "Creating"/"Active"/"Deleting"/"Deleted" for RPW processing
- sled-agent: API `v7` with multicast join/leave endpoints
- Multicast-aware instance management
- Network interface configuration for multicast traffic
- OPTE integration stubbed pending oxidecomputer/opte#847
- Inventory / Port correlation
- Validates baseboard identifiers match between sleds and SPs
- Required for multicast reconciler to map `sled_id` → rear switch-ports (backplane) for instances
- `mvlan`: External groups support an optional Multicast VLAN for (eventual) upstream egress
- Updates to instance sagas as Nexus passes memberships to sled-agent via `InstanceSledLocalConfig.multicast_groups`
*API Endpoints*:
- GET /v1/multicast-groups: List fleet multicast groups
- POST /v1/multicast-groups: Create multicast group
- GET /v1/multicast-groups/{group}: View group details
- PUT /v1/multicast-groups/{group}: Update group (name, sources)
- DELETE /v1/multicast-groups/{group}: Delete group
- GET /v1/multicast-groups/{group}/members: List group members
- POST /v1/multicast-groups/{group}/members: Add instance to group
- DELETE /v1/multicast-groups/{group}/members/{instance}: Remove instance from group
- GET /v1/instances/{instance}/multicast-groups: List groups for an instance
- PUT /v1/instances/{instance}/multicast-groups/{group}: Join instance to group
- DELETE /v1/instances/{instance}/multicast-groups/{group}: Leave group
- GET /v1/system/multicast-groups/by-ip/{address}: Lookup group by IP address
The instance-scoped endpoints provide an alternative interface for the same join/leave operations, and there's also the system-level IP lookup endpoint.
*New Sagas*:
- `multicast_group_dpd_ensure`: Ties together external/underlay creation of groups on all switches
- `multicast_group_dpd_update`: Updates group configuration across switches
*Breaking Changes*:
- sled-agent API version bump from `v6` to `v7`
- New required configuration in Nexus (multicast.enabled flag, reconciler period, and cache TTL settings)
- Schema migration required (`v208` → `v209`)
*Migration Notes*:
- Multicast as a feature is disabled by default for safe rollout
- Multicast endpoints are marked as "experimental"
*References*:
- RFD 488: https://rfd.shared.oxide.computer/rfd/488
- IP Pool extensions: #9084
- Dendrite PRs (based on recency):
  - oxidecomputer/dendrite#132
  - oxidecomputer/dendrite#109
  - oxidecomputer/dendrite#14
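The 3-state member lifecycle ("Joining" → "Joined" → "Left", with reactivation) can be pictured as a small state machine. The sketch below is an assumption-laden illustration rather than the PR's implementation: the `MemberEvent` variants and the `next_state` function are hypothetical, and the exact set of allowed transitions is inferred from the description above.

```rust
/// Member states as named in the description.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum MemberState {
    Joining,
    Joined,
    Left,
}

/// Hypothetical events that drive transitions in this sketch.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum MemberEvent {
    /// The switch acknowledged the dataplane programming.
    DataplaneProgrammed,
    /// The instance stopped or the member was marked for removal.
    InstanceStopped,
    /// The instance started again (reactivation).
    InstanceStarted,
}

/// Returns the next state, or `None` if the event does not apply
/// in the current state.
pub fn next_state(state: MemberState, event: MemberEvent) -> Option<MemberState> {
    use MemberEvent::*;
    use MemberState::*;
    match (state, event) {
        (Joining, DataplaneProgrammed) => Some(Joined),
        (Joining, InstanceStopped) | (Joined, InstanceStopped) => Some(Left),
        // Reactivation: a member that left re-enters through Joining.
        (Left, InstanceStarted) => Some(Joining),
        _ => None,
    }
}
```

Returning `None` for inapplicable events keeps the function total, so a reconciler built on it could treat a repeated event as a no-op.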
This work introduces multicast IP pool capabilities to support external multicast traffic routing through the rack's switching infrastructure.
Closes #8217.