Merge pull request rook#14230 from BlaineEXE/multus-validation-test-add-host-checking

multus: add host checking to validation tool
BlaineEXE authored Jul 11, 2024
2 parents 28addd1 + 33f5407 commit fd3f4c8
Showing 16 changed files with 521 additions and 258 deletions.
3 changes: 3 additions & 0 deletions .github/workflows/multus.yaml
@@ -55,6 +55,9 @@ jobs:
- name: Setup multus
run: ./tests/scripts/multus/setup-multus.sh

- name: Set up multus prerequisite host routing
run: kubectl create -f tests/scripts/multus/host-cfg-ds.yaml

- name: Install public and cluster NADs in default namespace
run: kubectl create -f tests/scripts/multus/default-public-cluster-nads.yaml

47 changes: 34 additions & 13 deletions Documentation/CRDs/Cluster/network-providers.md
@@ -207,26 +207,43 @@ ranges could be manually specified for the networks if needed.

### Validating Multus configuration

We **highly** recommend validating your Multus configuration before you install Rook. A tool exists
to facilitate validating the Multus configuration. After installing the Rook operator and before
installing any Custom Resources, run the tool from the operator pod.
We **highly** recommend validating your Multus configuration before you install a CephCluster.
A tool exists to facilitate validating the Multus configuration. After installing the Rook operator
and before installing any Custom Resources, run the tool from the operator pod.

The tool's CLI is designed to be as helpful as possible. Get help text for the multus validation
tool like so:

```console
kubectl --namespace rook-ceph exec -it deploy/rook-ceph-operator -- rook multus validation run --help
```
1. Exec into the Rook operator pod

```console
kubectl --namespace rook-ceph exec -it deploy/rook-ceph-operator -- bash
```

2. Output and read the tool's help text

Then, update the args in the
[multus-validation](https://github.com/rook/rook/blob/master/deploy/examples/multus-validation.yaml)
job template. Minimally, add the NAD names(s) for public and/or cluster as needed and then,
create the job to validate the Multus configuration.
```console
rook multus validation run --help
```

If the tool fails, it will suggest what things may be preventing Multus networks from working
properly, and it will request the logs and outputs that will help debug issues.
3. Use the validation tool config file for advanced configuration.

Check the logs of the pod created by the job to know the status of the validation test.
```console
rook multus validation config --help
```

Generate a sample config, that includes commented help text, using one of the available templates.

4. Run the tool after configuring. If the tool fails, it will suggest what things may be preventing
Multus networks from working properly, and it will request the logs and outputs that will help
debug issues.

!!! note
The tool requires host network access. Many Kubernetes distros have security limitations. Use
the tool's `serviceAccountName` config option or `--service-account-name` CLI flag to instruct
the tool to run using a particular ServiceAccount in order to allow necessary permissions.
An example compatible with openshift is provided in the Rook repository at
[deploy/examples/multus-validation-test-openshift.yaml](https://github.com/rook/rook/blob/master/deploy/examples/multus-validation-test-openshift.yaml)

### Known limitations with Multus

@@ -445,6 +462,10 @@ may be something like PXE, ignition config, cloud-init, Ansible, or any other su
reboot is likely necessary to apply configuration updates, but wait until the next step to reboot
nodes.

If desired, check that the NetworkAttachmentDefinition modification and host configurations are
compatible using the [Multus validation tool](#validating-multus-configuration). For the upgrade
case, use the `hostCheckOnly: true` config option or `--host-check-only` CLI flag.

**Step 4**

After the NetworkAttachmentDefinition is modified, OSD pods must be restarted. It is easiest to
1 change: 0 additions & 1 deletion Makefile
@@ -179,7 +179,6 @@ prune: ## Prune cached artifacts.
@$(MAKE) -C images prune

docs: helm-docs
@build/deploy/generate-deploy-examples.sh

crds: $(CONTROLLER_GEN) $(YQ)
@echo Updating CRD manifests
24 changes: 0 additions & 24 deletions build/deploy/generate-deploy-examples.sh

This file was deleted.

14 changes: 13 additions & 1 deletion cmd/rook/userfacing/multus/validation/validation.go
@@ -38,6 +38,9 @@ var (

// keep special var for `--daemons-per-node` that needs put into node config for validation run
flagDaemonsPerNode = -1

// keep special var for --host-check-only flag that can override what is from config file
flagHostCheckOnly = false
)

// commands
@@ -131,6 +134,9 @@ func init() {
"The default value is set to the worst-case value for a Rook Ceph cluster with 3 portable OSDs, 3 portable monitors, "+
"and where all optional child resources have been created with 1 daemon such that they all might run on a single node in a failure scenario. "+
"If you aren't sure what to choose for this value, add 1 for each additional OSD beyond 3.")
runCmd.Flags().BoolVar(&flagHostCheckOnly, "host-check-only", defaultConfig.HostCheckOnly,
"Only check that hosts can connect to the server via the public network. Do not start clients. "+
"This mode is recommended when a Rook cluster is already running and consuming the public network specified.")
runCmd.Flags().StringVar(&validationConfig.NginxImage, "nginx-image", defaultConfig.NginxImage,
"The Nginx image used for the validation server and clients.")

@@ -147,7 +153,8 @@ func init() {
"clients to start, and it therefore may take longer for all clients to become 'Ready'; in that case, this value can be set slightly higher.")

runCmd.Flags().StringVarP(&validationConfigFile, "config", "c", "",
"The validation test config file to use. This cannot be used with other flags.")
"The validation test config file to use. This cannot be used with other flags except --host-check-only.")
// allow using --host-check-only in combo with --config so the same config can be used with that flag if desired
runCmd.MarkFlagsMutuallyExclusive("config", "timeout-minutes")
runCmd.MarkFlagsMutuallyExclusive("config", "namespace")
runCmd.MarkFlagsMutuallyExclusive("config", "public-network")
@@ -184,6 +191,11 @@ func runValidation(ctx context.Context) {
}
}

// allow --host-check-only(=true) flag to override default/configfile settings
if flagHostCheckOnly {
validationConfig.HostCheckOnly = true
}

if err := validationConfig.ValidationTestConfig.Validate(); err != nil {
fmt.Print(err.Error() + "\n")
os.Exit(22 /* EINVAL */)
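
The override added in this hunk is one-directional: the flag can switch host-check-only mode on, but passing `--host-check-only=false` does not switch off a value already enabled in the config file. The sketch below restates that behavior in isolation. The `config` struct and the `applyHostCheckOnlyFlag` function are hypothetical stand-ins, not the tool's actual types.

```go
package main

import "fmt"

// config is a hypothetical stand-in for the validation test config;
// only the field relevant to this hunk is modeled.
type config struct {
	HostCheckOnly bool
}

// applyHostCheckOnlyFlag mirrors the override in the diff: a true flag forces
// host-check-only mode on, while a false flag leaves the config value alone.
// The effective value is therefore the logical OR of the two inputs.
func applyHostCheckOnlyFlag(cfg config, flagHostCheckOnly bool) config {
	if flagHostCheckOnly {
		cfg.HostCheckOnly = true
	}
	return cfg
}

func main() {
	// Enumerate all four combinations of config-file value and CLI flag.
	for _, fromFile := range []bool{false, true} {
		for _, fromFlag := range []bool{false, true} {
			got := applyHostCheckOnlyFlag(config{HostCheckOnly: fromFile}, fromFlag)
			fmt.Printf("config=%t flag=%t -> effective=%t\n", fromFile, fromFlag, got.HostCheckOnly)
		}
	}
}
```

Because the override runs before `Validate()` is called, validation sees the effective value and not the one read from the file.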
38 changes: 38 additions & 0 deletions deploy/examples/multus-validation-test-openshift.yaml
@@ -0,0 +1,38 @@
# ServiceAccount and RBAC to support running multus validation test on OpenShift
# Deploy these resources, then use `serviceAccountName: multus-validation-test` in the validation
# test config file.
---
apiVersion: v1
kind: ServiceAccount
metadata:
name: multus-validation-test
namespace: openshift-storage
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
name: multus-validation-test
namespace: openshift-storage
rules:
- apiGroups:
- security.openshift.io
resourceNames:
- hostnetwork-v2
resources:
- securitycontextconstraints
verbs:
- use
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
name: multus-validation-test
namespace: openshift-storage
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: Role
name: multus-validation-test
subjects:
- kind: ServiceAccount
name: multus-validation-test
namespace: openshift-storage
173 changes: 0 additions & 173 deletions deploy/examples/multus-validation.yaml

This file was deleted.
