Conversation

@clarkzinzow
Collaborator

No description provided.

evgenLevin and others added 30 commits September 4, 2024 08:42
PrometheusRules allow recording pre-defined queries. Use the
`sriov_kubepoddevice` metric to add the `pod|namespace` pair
to the SR-IOV metrics.

The feature is enabled via the `METRICS_EXPORTER_PROMETHEUS_DEPLOY_RULE`
environment variable.

Signed-off-by: Andrea Panattoni <apanatto@redhat.com>
When the `metricsExporter` feature is turned off, deployed resources
should be removed. These changes fix the error:

```
│ 2024-08-28T14:07:57.699760017Z    ERROR    controller/controller.go:266    Reconciler error    {"controller": "sriovoperatorconfig", "controllerGroup": "sriovnetwork.openshift.io", "controllerKind": "SriovOperatorConfig", "SriovOperatorConfig": {"name":"default","namespace":"openshift-sriov-network-operator"},  │
│ "namespace": "openshift-sriov-network-operator", "name": "default", "reconcileID": "fa841c50-dbb8-4c4c-9ddd-b98624fd2a24", "error": "failed to delete object &{map[apiVersion:monitoring.coreos.com/v1 kind:ServiceMonitor metadata:map[name:sriov-network-metrics-exporter namespace:openshift-sriov-network-operator]  │
│ spec:map[endpoints:[map[bearerTokenFile:/var/run/secrets/kubernetes.io/serviceaccount/token honorLabels:true interval:30s port:sriov-network-metrics scheme:https tlsConfig:map[caFile:/etc/prometheus/configmaps/serving-certs-ca-bundle/service-ca.crt insecureSkipVerify:false serverName:sriov-network-metrics-expor │
│ ter-service.openshift-sriov-network-operator.svc]]] namespaceSelector:map[matchNames:[openshift-sriov-network-operator]] selector:map[matchLabels:map[name:sriov-network-metrics-exporter-service]]]]} with err: could not delete object (monitoring.coreos.com/v1, Kind=ServiceMonitor) openshift-sriov-network-operato │
│ r/sriov-network-metrics-exporter: servicemonitors.monitoring.coreos.com \"sriov-network-metrics-exporter\" is forbidden: User \"system:serviceaccount:openshift-sriov-network-operator:sriov-network-operator\" cannot delete resource \"servicemonitors\" in API group \"monitoring.coreos.com\" in the namespace \"ope │
│ nshift-sriov-network-operator\""}
```

Signed-off-by: Andrea Panattoni <apanatto@redhat.com>
[metrics 4/x] Metrics exporter rules
Refactor some conformance tests to use `SRIOV_NODE_AND_DEVICE_NAME_FILTER`
If the current object has annotations and the updated one doesn't, we still want
to add the ones from the current object.

Signed-off-by: Sebastian Sch <sebassch@gmail.com>
When a user deletes the default SriovOperatorConfig resource and
tries to recreate it afterwards, the operator webhook returns the error:
```
Error from server (InternalError): error when creating "/tmp/opconfig.yml": Internal error occurred: failed calling webhook "operator-webhook.sriovnetwork.openshift.io": failed to call webhook: Post "https://operator-webhook-service.openshift-sriov-network-operator.svc:443/validating-custom-resource?timeout=10s": service "operator-webhook-service" not found
```

as the webhook configuration is still present, while the Service and the DaemonSet have been deleted.

Delete all the webhook configurations when the user deletes
the default SriovOperatorConfig

Signed-off-by: Andrea Panattoni <apanatto@redhat.com>
Fix merge annotation function
Signed-off-by: Andrea Panattoni <apanatto@redhat.com>
The bash syntax was incorrect and yielded:

    hack/env.sh: line 35: ${$RDMA_CNI_IMAGE:-}: bad substitution
Fix syntax for RDMA_CNI_IMAGE var substitution
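For reference, a minimal before/after sketch of the substitution (standard bash parameter expansion, not the literal hack/env.sh context):

```
# Broken: the extra "$" makes bash look for a parameter literally named
# "$RDMA_CNI_IMAGE", which is invalid and fails with "bad substitution".
# RDMA_CNI_IMAGE=${$RDMA_CNI_IMAGE:-}

# Fixed: expand the variable by name, defaulting to the empty string when unset.
RDMA_CNI_IMAGE=${RDMA_CNI_IMAGE:-}
```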
It might happen that two SR-IOV pods, deployed on different nodes, are using devices
with the same PCI address. In such cases, the query suggested [1] by the sriov-network-metrics-exporter produces the error:

```

Error loading values found duplicate series for the match group {pciAddr="0000:3b:02.4"} on the right hand-side of the operation:
    [
        {
            __name__="sriov_kubepoddevice",
            container="test",
            dev_type="openshift.io/intelnetdevice",
            endpoint="sriov-network-metrics",
            instance="10.1.98.60:9110",
            job="sriov-network-metrics-exporter-service",
            namespace="cnf-4916",
            pciAddr="0000:3b:02.4",
            pod="pod-cnfdr22.telco5g.eng.rdu2.redhat.com",
            prometheus="openshift-monitoring/k8s",
            service="sriov-network-metrics-exporter-service"
        }, {
            __name__="sriov_kubepoddevice",
            container="test",
            dev_type="openshift.io/intelnetdevice",
            endpoint="sriov-network-metrics",
            instance="10.1.98.230:9110",
            job="sriov-network-metrics-exporter-service",
            namespace="cnf-4916",
            pciAddr="0000:3b:02.4",
            pod="pod-dhcp-98-230.telco5g.eng.rdu2.redhat.com",
            prometheus="openshift-monitoring/k8s",
            service="sriov-network-metrics-exporter-service"
        }
    ];many-to-many matching not allowed: matching labels must be unique on one side
```

Configure the ServiceMonitor resource to add a `node` label to all metrics.
The correct query to get metrics, as updated in the PrometheusRule, becomes:

```
sriov_vf_tx_packets * on (pciAddr,node) group_left(pod,namespace,dev_type) sriov_kubepoddevice
```

Also remove the `pod`, `namespace`, and `container` labels from the `sriov_vf_*` metrics, as they were
wrongly set to `sriov-network-metrics-exporter-zj2n9`, `openshift-sriov-network-operator`, and `kube-rbac-proxy`.

[1] https://github.com/k8snetworkplumbingwg/sriov-network-metrics-exporter/blob/0f6a784f377ede87b95f31e569116ceb9775b5b9/README.md?plain=1#L38

Signed-off-by: Andrea Panattoni <apanatto@redhat.com>
When we want to use config-drive in immutable systems, very often the
config-drive is only used at boot and then unmounted (e.g. Ignition does
this).

Later, when we want to fetch metadata from the config-drive, we actually
have to mount it.

In this PR, I'm adding code similar to coreos/ignition, where we
dynamically mount the config-drive if a device is found with the
right label (config-2 or CONFIG-2, as documented in OpenStack). If the
device is found, we mount it, fetch the data, and unmount it.
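For illustration, a shell-level sketch of the same flow (the actual change is in the operator's Go code; the label names come from the OpenStack documentation, while the commands and the meta_data.json path are assumptions):

```
# Locate a block device labeled config-2 or CONFIG-2, mount it read-only,
# read the metadata, then unmount it again.
dev="$(blkid -L config-2 2>/dev/null || blkid -L CONFIG-2 2>/dev/null || true)"
if [ -n "$dev" ]; then
    mnt="$(mktemp -d)"
    mount -o ro "$dev" "$mnt"
    cat "$mnt/openstack/latest/meta_data.json"
    umount "$mnt"
    rmdir "$mnt"
fi
```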
Fixes the following shellcheck error:

    SC2068 (error): Double quote array expansions to avoid re-splitting elements.

https://www.shellcheck.net/wiki/SC2068
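An illustrative before/after (the `run_all` helper and array are hypothetical, not code from this repo):

```
args=("first arg" "second arg")
run_all() { printf '%s\n' "$@"; }

run_all ${args[@]}      # SC2068: unquoted expansion re-splits "first arg" into two words
run_all "${args[@]}"    # quoted expansion preserves each element as one argument
```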
Fixes the following shellcheck error:

    SC2148 (error): Tips depend on target shell and yours is unknown. Add a shebang or a 'shell' directive.

https://www.shellcheck.net/wiki/SC2148
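The fix is to declare the target shell at the top of the file, for example:

```
#!/usr/bin/env bash
# ...rest of the script; shellcheck can now apply bash-specific checks.
```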
Fixes the following shellcheck errors:

    SC2145 (error): Argument mixes string and array. Use * or separate argument.
    SC2199 (error): Arrays implicitly concatenate in [[ ]]. Use a loop (or explicit * instead of @).

https://www.shellcheck.net/wiki/SC2145
https://www.shellcheck.net/wiki/SC2199

Also fixes a typo in SUPPORTED_INTERFACE_SWITCHER_MODES.
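Illustrative examples of both findings (the array and variable names are placeholders, not the actual script code):

```
MODES=(legacy switchdev)
mode="switchdev"

echo "supported modes: ${MODES[@]}"    # SC2145: string and array mixed in one argument
echo "supported modes: ${MODES[*]}"    # joins the array into a single word

[[ " ${MODES[@]} " == *" $mode "* ]]   # SC2199: array implicitly concatenates in [[ ]]
[[ " ${MODES[*]} " == *" $mode "* ]]   # explicit * (or iterate with a loop)
```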
Fixes the following shellcheck error:

    SC2045 (error): Iterating over ls output is fragile. Use globs.

https://www.shellcheck.net/wiki/SC2045
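A before/after sketch (the directory is just an example):

```
for dev in $(ls /sys/class/net); do echo "$dev"; done    # SC2045: breaks on unusual names
for dev in /sys/class/net/*; do echo "${dev##*/}"; done  # glob expansion is safe
```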
On some kernels, GetDevlinkDeviceParam may return empty values
for some kernel parameters. The netlink library is able to handle this, but
the code in the GetDevlinkDeviceParam function may panic if an unexpected value
is received. Add extra checks to avoid panics.
Delete webhooks when SriovOperatorConfig is deleted
Fix: GetDevlinkDeviceParam to handle edge-cases correctly
[metrics 5/x] Add node label to sriov_* metrics
The `sriov_kubepoddevice` metric might take a while to show up in the Prometheus
database, as the default scrape interval is 30s. This leads
to failures in the end-to-end lane like:

```
[sriov] Metrics Exporter When Prometheus operator is available [It] Metrics should have the correct labels
/root/opr-ocp2-1/data/sriov-network-operator/sriov-network-operator/test/conformance/tests/test_exporter_metrics.go:132

  [FAILED] no value for metric sriov_kubepoddevice
```

Put the metric assertion in an `Eventually` statement

Signed-off-by: Andrea Panattoni <apanatto@redhat.com>
Signed-off-by: Sebastian Sch <sebassch@gmail.com>
Fixes the following shellcheck error:

    SC2081 (error): [ .. ] can't match globs. Use a case statement.

https://www.shellcheck.net/wiki/SC2081
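A before/after sketch (the variable and pattern are illustrative):

```
driver="mlx5_core"

if [ "$driver" == mlx5* ]; then echo "Mellanox"; fi   # SC2081: [ .. ] never matches globs

case "$driver" in                                     # case (or [[ .. ]]) does glob matching
    mlx5*) echo "Mellanox" ;;
esac
```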
Warns about shellcheck issues with severity `error`.
metrics: Fix `Metrics should have the correct labels` test
CI: Add a bash linter to pre-submits
openstack: dynamically mount the config-drive
When the operator changes the device-plugin Spec (e.g. .Spec.NodeSelector), it may happen
that there are two device plugin pods for a given node, one that is terminating and one that is
initializing.
If the config-daemon executes `restartDevicePluginPod()` at the same time, it may kill the terminating
pod, while the initializing one will keep running with the old device-plugin configuration. This may cause one or more resources
not to be advertised until a manual device plugin restart occurs.

Make the config-daemon restart all the device-plugin instances it finds for its own node.

Signed-off-by: Andrea Panattoni <apanatto@redhat.com>
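The kubectl equivalent of what the config-daemon now does (the namespace, label, and node variable are assumptions for illustration):

```
# Delete every device-plugin pod scheduled on this node, not just the first match;
# the DaemonSet controller recreates them with the new configuration.
kubectl -n sriov-network-operator delete pod \
    -l app=sriov-device-plugin \
    --field-selector "spec.nodeName=${NODE_NAME}"
```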
e0ne and others added 26 commits December 16, 2024 22:36
Signed-off-by: Ivan Kolodiazhnyi <ikolodiazhny@nvidia.com>
This is needed because, after a reboot on a single node,
the operator webhook may not be ready.

Signed-off-by: Sebastian Sch <sebassch@gmail.com>
functest: add retry for rdma functional test
…SET block

When  "$SKIP_VAR_SET" is unset and the environment variables fallback to
the default, the check for valid values should be done. Move the check
out of the $SKIP_VAR_SET block for that.

For the current "hack/env.sh" this maybe not make an actual difference,
because probably the code to assign default values will ensure that
always valid value are set. Note that the openshift variant of the above
code will detect the default via skopeo, which can fail. For that
reason, this change makes more sense for openshift. However, also for
the current code, performing the same error checking after filling out
default values, ensures that the detected values are considered valid
Even if that is in fact always the case, it's not entirely trivial to
see.
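A minimal sketch of the restructuring, with placeholder variable names and default values (not the real hack/env.sh content):

```
if [ -z "${SKIP_VAR_SET:-}" ]; then
    # defaults are only assigned when the user did not opt out
    SRIOV_CNI_IMAGE="${SRIOV_CNI_IMAGE:-example.com/sriov-cni:latest}"
fi

# the validity check now runs outside the block, so detected/default values are verified too
if [ -z "${SRIOV_CNI_IMAGE:-}" ]; then
    echo "SRIOV_CNI_IMAGE must not be empty" >&2
    exit 1
fi
```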
[CVE-2024-45338](GHSA-w32m-9786-jp63)

Signed-off-by: Andrea Panattoni <apanatto@redhat.com>
bump `golang.org/x/net` to `v0.33.0`
If we run on a system where the PF is not connected to the network,
we can still use it for tests, but we need the link state not to be
auto.

Signed-off-by: Sebastian Sch <sebassch@gmail.com>
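For illustration, the VF administrative link state can be forced with iproute2 (the interface name and VF index are placeholders):

```
# Keep the VF link up even when the PF has no uplink, instead of leaving it on "auto".
ip link set dev "$PF_NAME" vf 0 state enable
```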
This will fix the issue we sometimes see: `<string>: Dump was interrupted and may be
inconsistent.\n`

https://docs.kernel.org/userspace-api/netlink/intro.html#dump-consistency

Signed-off-by: Sebastian Sch <sebassch@gmail.com>
It's enough to configure the ib_core module in /etc/modprobe.d/ on
Ubuntu to change the RDMA subsystem mode.

This commit also adds an OS check to kargs.sh, because 'grubby'
isn't available in official Ubuntu repositories.

Kernel parameter configuration support for Ubuntu should be implemented
in a separate commit.
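A minimal sketch of the idea, assuming `netns_mode` is the ib_core parameter used to select the RDMA subsystem mode (0 = exclusive, 1 = shared):

```
# Persist the RDMA subsystem mode on Ubuntu via a modprobe.d drop-in,
# so it is applied when the ib_core module is loaded.
cat <<'EOF' > /etc/modprobe.d/ib_core.conf
options ib_core netns_mode=0
EOF
```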

Signed-off-by: Ivan Kolodiazhnyi <ikolodiazhny@nvidia.com>
Signed-off-by: Sebastian Sch <sebassch@gmail.com>
feat: Update controller logic to handle stale SriovNetworkNodeState CRs with delay
Skip kernel parameters configuration for Ubuntu
Signed-off-by: Ivan Kolodiazhnyi <ikolodiazhny@nvidia.com>
To use a non-default MTU, OVS supports the
"mtu_request" field when adding a port to
the bridge.

e.g.:
https://docs.openvswitch.org/en/latest/topics/dpdk/jumbo-frames/

Signed-off-by: Fred Rolland <frolland@nvidia.com>
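For reference, this maps to the ovs-vsctl form documented in the link above (bridge, port, and MTU values here are placeholders):

```
# Request a 9000-byte MTU for the port while adding it to the bridge.
ovs-vsctl add-port br0 pf0vf0 -- set Interface pf0vf0 mtu_request=9000
```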
With the introduction of the RDMA system mode change on bare-metal
systems, the run takes more than 1h, which is the default timeout for
Ginkgo.
Signed-off-by: Sebastian Sch <sebassch@gmail.com>
[th/hack-env-check] hack/env.sh: move checking of environment variables outside SKIP_VAR_SET block
Support mtu_request for OVS
When creating a bridge with ovs-vsctl, an internal
interface is added by default.
The same behavior is added in this commit.

ovs-vsctl code ref:
https://github.com/openvswitch/ovs/blob/main/utilities/ovs-vsctl.c#L1597

Signed-off-by: Fred Rolland <frolland@nvidia.com>
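The ovs-vsctl behavior being mirrored can be seen directly (the bridge name is just an example):

```
# Creating a bridge also creates a local port/interface of type "internal"
# with the same name as the bridge.
ovs-vsctl add-br br-ex
ovs-vsctl get Interface br-ex type    # prints: internal
```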
Do not configure BlueField NICs in DPU mode
Signed-off-by: Sebastian Sch <sebassch@gmail.com>
RDMA functional test improvements
@github-actions

Thanks for your PR,
To run vendors CIs, Maintainers can use one of:

  • /test-all: To run all tests for all vendors.
  • /test-e2e-all: To run all E2E tests for all vendors.
  • /test-e2e-nvidia-all: To run all E2E tests for NVIDIA vendor.

To skip the vendors CIs, Maintainers can use one of:

  • /skip-all: To skip all tests for all vendors.
  • /skip-e2e-all: To skip all E2E tests for all vendors.
  • /skip-e2e-nvidia-all: To skip all E2E tests for NVIDIA vendor.
Best regards.

@punkerpunker punkerpunker self-requested a review January 31, 2025 02:50

@punkerpunker punkerpunker left a comment

🫡

@clarkzinzow clarkzinzow merged commit 2e94ce4 into togethercomputer:master Jan 31, 2025
12 of 14 checks passed