Add federation to skmo #3766
Build failed (check pipeline). Post https://softwarefactory-project.io/zuul/t/rdoproject.org/buildset/3f0763b046e041c18b43ec998692e6d3 ❌ openstack-k8s-operators-content-provider FAILURE in 10m 52s

Build failed (check pipeline). Post https://softwarefactory-project.io/zuul/t/rdoproject.org/buildset/735d0c0530b44e039353be5e0993611a ✔️ openstack-k8s-operators-content-provider SUCCESS in 1h 46m 16s

Build failed (check pipeline). Post https://softwarefactory-project.io/zuul/t/rdoproject.org/buildset/f424a1444f9247a78d0afc7cb1f4660f ✔️ openstack-k8s-operators-content-provider SUCCESS in 2h 11m 03s
f588376 to 6a74e12
…ooks

Add support for Shared Keystone Multi-region OpenStack (SKMO) deployments with cross-region Barbican keystone listener:

Playbooks:
- prepare-leaf.yaml: Pre-stage hook that creates a TransportURL CR in the central region for the leaf's barbican-keystone-listener, copies the generated secret to the leaf namespace, extracts the rootca-internal CA cert from central and adds it to the leaf's custom-ca-certs bundle, and waits for central Keystone and openstackclient readiness with retry logic
- configure-leaf-listener.yaml: Post-stage hook that patches the leaf OpenStackControlPlane with the cross-region transport_url for the barbican-keystone-listener
- trust-leaf-ca.yaml: Post-stage hook that extracts the leaf region's rootca-public and rootca-internal CA certs and adds them to the central region's custom-ca-certs bundle
- ensure-central-ca-bundle.yaml: Ensures the central CA bundle secret exists before the leaf control plane deployment

Scenario:
- va-multi-skmo.yml reproducer scenario configuration
- multi-namespace-skmo architecture scenario symlink

Co-Authored-By: Claude <noreply@anthropic.com>
Signed-off-by: Ade Lee <alee@redhat.com>
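The central-to-leaf secret copy step in prepare-leaf.yaml could be sketched along these lines (the task names, namespace variables, and the transport-url secret name are illustrative assumptions, not the playbook's actual content):

```yaml
# Illustrative sketch only: secret and variable names are assumptions.
- name: Read the TransportURL secret generated in the central region
  kubernetes.core.k8s_info:
    kind: Secret
    namespace: "{{ central_namespace }}"            # hypothetical variable
    name: barbican-keystone-listener-transport-url  # assumed secret name
  register: _central_secret

- name: Copy the secret into the leaf namespace
  kubernetes.core.k8s:
    state: present
    definition:
      apiVersion: v1
      kind: Secret
      metadata:
        name: "{{ _central_secret.resources[0].metadata.name }}"
        namespace: "{{ leaf_namespace }}"           # hypothetical variable
      type: Opaque
      data: "{{ _central_secret.resources[0].data }}"
```

Reusing the base64 `data` map directly avoids any decode/re-encode round trip when mirroring the secret across namespaces.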
00dba0e to 7b69e43
…mespace SKMO scenario

Add a 4th extra disk to OCP VMs in the SKMO reproducer and enable the devscripts MachineConfig-based cinder-volumes LVM VG setup:
- extra_disks_num: 3 -> 4 to provide a dedicated disk (/dev/vdd) for Cinder
- cifmw_devscripts_create_logical_volume: true to generate the MachineConfig that creates the cinder-volumes VG via a systemd unit at boot time
- cifmw_devscripts_cinder_volume_pvs: [/dev/vdd] to target the 4th disk
- cifmw_devscripts_enable_iscsi_on_ocp_nodes: true to enable iscsid on OCP nodes (required for the iSCSI target created by cinder-volume)

LVMS continues to use the original three disks (/dev/vda, /dev/vdb, /dev/vdc).

Co-Authored-By: Claude <noreply@anthropic.com>
Signed-off-by: Ade Lee <alee@redhat.com>
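Per the description above, the scenario change boils down to a small variables block; the exact file layout may differ, but the keys and values are as stated:

```yaml
# va-multi-skmo.yml (relevant settings as described above)
extra_disks_num: 4                                # was 3; /dev/vdd is dedicated to Cinder
cifmw_devscripts_create_logical_volume: true      # MachineConfig creates the cinder-volumes VG at boot
cifmw_devscripts_cinder_volume_pvs:
  - /dev/vdd
cifmw_devscripts_enable_iscsi_on_ocp_nodes: true  # iscsid needed for the cinder-volume iSCSI target
```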
…ecret

Add a new variable cifmw_federation_ca_bundle_secret_name (default: "") to the federation role. When set, hook_controlplane_config.yml merges the Keycloak CA certificate as a new key (keycloak-ca.crt) into the named secret rather than creating a separate 'keycloakca' secret. If the named secret does not yet exist it is created automatically.

In merge mode the kustomization patch omits the spec.tls.caBundleSecretName op-add, since the OpenStackControlPlane CR is assumed to already reference the correct secret (e.g. custom-ca-certs in SKMO deployments).

When cifmw_federation_ca_bundle_secret_name is empty the original behaviour is preserved for backward compatibility: a dedicated 'keycloakca' secret is created and the kustomization patches spec.tls.caBundleSecretName to point at it.

Signed-off-by: Ade Lee <alee@redhat.com>
Co-Authored-By: Claude <noreply@anthropic.com>
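In merge mode, the key injection could look roughly like this (the namespace and the certificate variable are assumptions; kubernetes.core.k8s merges the new data key into an existing secret, or creates the secret when it is missing):

```yaml
# Sketch of the merge path; namespace and cert variable are assumptions.
- name: Merge the Keycloak CA into the named CA bundle secret
  kubernetes.core.k8s:
    state: present
    definition:
      apiVersion: v1
      kind: Secret
      metadata:
        name: "{{ cifmw_federation_ca_bundle_secret_name }}"
        namespace: openstack                                 # assumed namespace
      data:
        keycloak-ca.crt: "{{ _keycloak_ca_pem | b64encode }}"  # hypothetical variable
  when: cifmw_federation_ca_bundle_secret_name | length > 0
```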
Two bugs in run_keycloak_setup.yml:
1. The 'until' condition wrapped its expression in {{ }} delimiters,
which Ansible forbids in conditionals (causes a parse error).
2. map(attribute='metadata.labels') returns a dict per resource;
select('match', ...) cannot regex-match a dict, causing
'dict object has no attribute labels' at runtime.
Fix by removing the {{ }} and using dict2items + flatten to extract
label keys before applying the regex selector.
Signed-off-by: Ade Lee <alee@redhat.com>
Co-Authored-By: Claude <noreply@anthropic.com>
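The fix described above can be sketched as a before/after of the retry condition (the registered variable and the label regex are illustrative):

```yaml
# Before (broken): templating delimiters in a conditional, and
# select('match') applied to the dicts returned by map(attribute=...):
# until: "{{ pods.resources | map(attribute='metadata.labels') | select('match', '^app') | list | length > 0 }}"

# After (working): bare expression; labels flattened to key names first
until: >-
  pods.resources
  | map(attribute='metadata.labels')
  | map('dict2items') | flatten
  | map(attribute='key')
  | select('match', '^app')
  | list | length > 0
```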
…re writing

The ansible.builtin.copy task that writes keystone_federation.yaml fails if the destination directory does not yet exist. Add an explicit ansible.builtin.file task (state: directory) immediately before the two copy tasks so the directory is created on demand.

Signed-off-by: Ade Lee <alee@redhat.com>
Co-Authored-By: Claude <noreply@anthropic.com>
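The added task amounts to a standard pre-create of the destination (the path variable here is a hypothetical stand-in for the role's actual variable):

```yaml
- name: Ensure the destination directory exists before writing
  ansible.builtin.file:
    path: "{{ _federation_dest_dir }}"   # hypothetical variable for the target directory
    state: directory
    mode: "0755"
```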
…test

The customServiceConfig patch that adds 'openid' to Keystone's [auth] methods is applied during the control-plane kustomize deploy (stage 5). By the time the leaf control-plane post_stage_run hooks execute (including federation-post-deploy.yml), Keystone may not have finished reconciling with the new config. Domain/IdP/mapping/protocol creation succeeds because it uses the existing password auth path; only get-token.sh (which authenticates via openid) fails with HTTP 401 'unsupported method'.

Add a wait-for-Ready loop on the KeystoneAPI CR at the start of hook_post_deploy.yml (retries=30, delay=20s = up to 10 min) so the auth test only runs once Keystone has restarted with the federation configuration active.

Signed-off-by: Ade Lee <alee@redhat.com>
Co-Authored-By: Claude <noreply@anthropic.com>
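The readiness gate could be sketched as a k8s_info retry loop (the CR namespace variable is an assumption; retries/delay match the values above):

```yaml
# Sketch only: namespace variable is an assumption.
- name: Wait for KeystoneAPI to be Ready with the federation config active
  kubernetes.core.k8s_info:
    api_version: keystone.openstack.org/v1beta1
    kind: KeystoneAPI
    namespace: "{{ leaf_namespace }}"     # hypothetical variable
  register: _keystoneapi
  retries: 30
  delay: 20                               # 30 x 20s = up to 10 minutes
  until: >-
    _keystoneapi.resources | length > 0 and
    _keystoneapi.resources[0].status.conditions | default([])
    | selectattr('type', 'equalto', 'Ready')
    | selectattr('status', 'equalto', 'True')
    | list | length > 0
```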
The kustomizations/controlplane/ directory is only consumed by the edpm_prepare / ci_kustomize flow (CRC/devscripts deployments). In the kustomize_deploy flow used by SKMO (deploy-architecture.sh), nothing reads that directory, so the keystone_federation.yaml file was written but never applied, leaving the OSCP unmodified.

Add Step 6 to hook_controlplane_config.yml that:
1. Checks whether the OpenStackControlPlane CR already exists.
2. If so, patches it directly via kubernetes.core.k8s (state: patched) with the httpdCustomization, customServiceConfig (openid methods), and (in dedicated-secret mode) spec.tls.caBundleSecretName.

The kustomization file is still written for backward compatibility with deployments that use edpm_prepare (CRC/devscripts flow). The direct patch is a no-op when the OSCP does not yet exist (fresh install with CRC flow), making both paths safe.

Signed-off-by: Ade Lee <alee@redhat.com>
Co-Authored-By: Claude <noreply@anthropic.com>
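Step 6 could be sketched as a lookup followed by a conditional direct patch (the namespace and the exact methods string are assumptions, and the httpdCustomization part is omitted for brevity):

```yaml
# Sketch only: namespace and methods string are assumptions.
- name: Check whether the OpenStackControlPlane already exists
  kubernetes.core.k8s_info:
    api_version: core.openstack.org/v1beta1
    kind: OpenStackControlPlane
    namespace: openstack                 # assumed namespace
  register: _oscp

- name: Patch the existing OSCP with the federation settings
  kubernetes.core.k8s:
    state: patched
    api_version: core.openstack.org/v1beta1
    kind: OpenStackControlPlane
    name: "{{ _oscp.resources[0].metadata.name }}"
    namespace: openstack
    definition:
      spec:
        keystone:
          template:
            customServiceConfig: |      # INI content; methods list is an assumption
              [auth]
              methods = password,token,openid
  when: _oscp.resources | length > 0
```

Because the patch task is gated on the lookup, it simply does nothing on a fresh install where the CR has not been created yet, which is the "both paths safe" property described above.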
When deploy-architecture.sh is re-run against an existing deployment, the federation domain, identity provider, mapping, group, project and protocol may already exist in Keystone. The plain 'openstack X create' commands fail with HTTP 409 Conflict in that case.

Fix by checking for the existence of each resource with 'openstack X show' (failed_when: false, changed_when: false) before attempting to create it. The create task is only run when the show returned rc != 0 (i.e. the resource was not found). Role-add is repeated unconditionally with failed_when: false because the Keystone API makes it idempotent already.

Signed-off-by: Ade Lee <alee@redhat.com>
Co-Authored-By: Claude <noreply@anthropic.com>
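The show-then-create pattern for one resource (the domain) could look like this (the domain name and credential handling are illustrative):

```yaml
# Pattern sketch for a single resource; the domain name is an assumption.
- name: Check whether the federation domain already exists
  ansible.builtin.command:
    cmd: openstack domain show federated_domain
  register: _domain_show
  failed_when: false
  changed_when: false

- name: Create the federation domain only when it was not found
  ansible.builtin.command:
    cmd: openstack domain create federated_domain
  when: _domain_show.rc != 0
```

The same two-task shape repeats for the identity provider, mapping, group, project and protocol.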
…ues template
The edpm-nodeset2-values template derives _vm_type by splitting the first
node name from the existing values.yaml (e.g. edpm-compute-0 -> compute).
It then uses _vm_type to find matching instances (startswith compute2-).
This creates a self-poisoning 3-run death spiral:
Run 1: nodes have git placeholder names (edpm-compute-0)
-> _vm_type=compute -> finds compute2-* instances -> writes real
hostnames (edpm-compute2-XXXXX-0) back to values.yaml
Run 2: nodes now have real CI hostnames (edpm-compute2-XXXXX-0)
-> _vm_type=compute2 -> searches for compute22-* (does not exist)
-> instances_names=[] -> writes nodes: null back to values.yaml
Run 3: nodes is null (Python None)
-> None | default({}) returns None (default only fires for Undefined)
-> None.keys() -> CRASH: None has no attribute keys
Fix with two changes:
1. Replace | default({}) with explicit None-safe conditional so that
an explicit YAML null does not sneak through as Python None.
2. Strip trailing digits from the derived _vm_type so that after run 1
rewrites node names, compute2 strips back to compute and the instance
lookup continues to find compute2-* entries correctly on all subsequent
runs.
Signed-off-by: Ade Lee <alee@redhat.com>
Co-Authored-By: Claude <noreply@anthropic.com>
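The two fixes could be expressed as follows (variable names are illustrative; in the actual template the same logic lives in Jinja expressions):

```yaml
# Sketch only: variable names are illustrative.
- name: Derive _vm_type, stripping trailing digits (compute2 -> compute)
  ansible.builtin.set_fact:
    _vm_type: "{{ _first_node_name.split('-')[1] | regex_replace('[0-9]+$', '') }}"

- name: Read nodes None-safely (an explicit YAML null must not reach .keys())
  ansible.builtin.set_fact:
    _nodes: "{{ _existing_nodes if _existing_nodes is not none else {} }}"
```

The explicit `is not none` test is what `| default({})` cannot provide: `default` only substitutes for Undefined, so a literal `nodes: null` in values.yaml sails straight through it as Python None.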
An OpenStackDataPlaneDeployment (OSDPD) is an immutable record of a single deployment run. Once its Status.Deployed is true, the operator short-circuits reconciliation with "Already deployed" and will never re-run jobs, even if the referenced nodesets have since been updated with new content (e.g. new SSH keys, new node config).

When ci-framework re-applies a deployment stage with oc apply and the OSDPD already exists from a previous run, the operator ignores it. Meanwhile the nodeset operator resets DeploymentReady=False because it detects that the nodeset's generation has advanced since the last deployment. This produces a permanent deadlock: the nodeset waits for a deployment that will never run, and the wait condition times out after 60 minutes.

The correct model is: one OSDPD per deployment *run*, not per nodeset.

Fix by auto-generating a timestamp suffix (YYYYMMDDHHMMSS) once at the start of the first deployment stage and appending it to the name of every OpenStackDataPlaneDeployment resource found in the kustomize build output before applying it. The suffix is stable within a single ansible run (so both edpm-deployment and edpm-deployment2 share the same suffix) but differs across runs, producing names like:

edpm-deployment-20260313215236
edpm-deployment-20260314093012

Old OSDPDs are left in place as an audit trail of past runs. The operator only acts on the new CR, so the deadlock cannot occur.

The suffix can be pinned by setting cifmw_kustomize_deploy_osdpd_suffix explicitly (useful for idempotent re-runs of the same logical deployment). Leave it empty (the default) for automatic timestamp generation.

Signed-off-by: Ade Lee <alee@redhat.com>
Co-Authored-By: Claude <noreply@anthropic.com>
Made-with: Cursor
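The suffix mechanism could be sketched like this (the task names, build-output variable, and combine-based rename are assumptions about the implementation, not the actual role code):

```yaml
# Sketch only: variable names and structure are assumptions.
- name: Generate a per-run OSDPD suffix unless one is pinned
  ansible.builtin.set_fact:
    _osdpd_suffix: >-
      {{ cifmw_kustomize_deploy_osdpd_suffix
         if cifmw_kustomize_deploy_osdpd_suffix | default('') | length > 0
         else lookup('pipe', 'date +%Y%m%d%H%M%S') }}
  run_once: true

- name: Append the suffix to every OSDPD in the kustomize build output
  ansible.builtin.set_fact:
    _renamed_osdpds: >-
      {{ _renamed_osdpds | default([])
         + [item | combine({'metadata': {'name': item.metadata.name ~ '-' ~ _osdpd_suffix}},
                           recursive=true)] }}
  loop: "{{ _build_output | selectattr('kind', 'equalto', 'OpenStackDataPlaneDeployment') | list }}"
```

Computing the suffix once with run_once (rather than per resource) is what keeps edpm-deployment and edpm-deployment2 aligned within a single run.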
7b69e43 to b0ed8a7
Build failed (check pipeline). Post https://softwarefactory-project.io/zuul/t/rdoproject.org/buildset/9a5b3bdb290346f4afb91921e37419c7 ✔️ openstack-k8s-operators-content-provider SUCCESS in 2h 13m 36s

recheck

Build failed (check pipeline). Post https://softwarefactory-project.io/zuul/t/rdoproject.org/buildset/d1f9efba90624c1595998f89fea46d3e ✔️ openstack-k8s-operators-content-provider SUCCESS in 2h 13m 34s
This adds cinder-volume and federation support to the SKMO scenario.