Merge pull request #617 from rook/release-1.14
build: Resync from upstream release-1.14 to downstream release-4.16
travisn authored Apr 10, 2024
2 parents e971ea4 + 90b3bb2 commit 700b3bd
Showing 39 changed files with 463 additions and 267 deletions.
2 changes: 1 addition & 1 deletion .github/workflows/canary-integration-test.yml
@@ -1087,7 +1087,7 @@ jobs:
with:
name: ${{ github.job }}-${{ matrix.ceph-image }}

encryption-pvc-kms-vault-k:
encryption-pvc-kms-vault-k8s-auth:
runs-on: ubuntu-20.04
if: "!contains(github.event.pull_request.labels.*.name, 'skip-ci')"
strategy:
2 changes: 1 addition & 1 deletion .github/workflows/golangci-lint.yaml
@@ -54,7 +54,7 @@ jobs:
steps:
- uses: actions/setup-go@v5
with:
go-version: "1.21"
go-version: "1.22.2"
check-latest: true
- name: govulncheck
uses: golang/govulncheck-action@v1
2 changes: 1 addition & 1 deletion .mergify.yml
@@ -318,7 +318,7 @@ pull_request_rules:
- "check-success=docs-check"
- "check-success=pylint"
- "check-success=canary (quay.io/ceph/ceph:v18)"
- "check-success=raw-disk (quay.io/ceph/ceph:v18)"
- "check-success=raw-disk-with-object (quay.io/ceph/ceph:v18)"
- "check-success=two-osds-in-device (quay.io/ceph/ceph:v18)"
- "check-success=osd-with-metadata-partition-device (quay.io/ceph/ceph:v18)"
- "check-success=osd-with-metadata-device (quay.io/ceph/ceph:v18)"
8 changes: 4 additions & 4 deletions Documentation/CRDs/Cluster/ceph-cluster-crd.md
@@ -195,11 +195,11 @@ Configure the network that will be enabled for the cluster and services.

* `provider`: Specifies the network provider that will be used to connect the network interface. You can choose between `host` and `multus`.
* `selectors`: Used for `multus` provider only. Select NetworkAttachmentDefinitions to use for Ceph networks.
* `public`: Select the NetworkAttachmentDefinition to use for the public network.
* `cluster`: Select the NetworkAttachmentDefinition to use for the cluster network.
* `public`: Select the NetworkAttachmentDefinition to use for the public network.
* `cluster`: Select the NetworkAttachmentDefinition to use for the cluster network.
* `addressRanges`: Used for `host` or `multus` providers only. Allows overriding the address ranges (CIDRs) that Ceph will listen on.
* `public`: A list of individual network ranges in CIDR format to use for Ceph's public network.
* `cluster`: A list of individual network ranges in CIDR format to use for Ceph's cluster network.
* `public`: A list of individual network ranges in CIDR format to use for Ceph's public network.
* `cluster`: A list of individual network ranges in CIDR format to use for Ceph's cluster network.
* `ipFamily`: Specifies the network stack Ceph daemons should listen on.
* `dualStack`: Specifies that Ceph daemon should listen on both IPv4 and IPv6 network stacks.
* `connections`: Settings for network connections using Ceph's msgr2 protocol
122 changes: 46 additions & 76 deletions Documentation/CRDs/Cluster/network-providers.md
@@ -79,27 +79,6 @@ to or from host networking after you update this setting, you will need to
[failover the mons](../../Storage-Configuration/Advanced/ceph-mon-health.md#failing-over-a-monitor)
in order to have mons on the desired network configuration.

## CSI Host Networking

Host networking for CSI pods is controlled independently from CephCluster networking. CSI can be
deployed with host networking or pod networking. CSI uses host networking by default, which is the
recommended configuration. CSI can be forced to use pod networking by setting the operator config
`CSI_ENABLE_HOST_NETWORK: "false"`.
When CSI uses pod networking (`"false"` value), it is critical that `csi-rbdplugin`,
`csi-cephfsplugin`, and `csi-nfsplugin` pods are not deleted or updated without following a special
process outlined below. If one of these pods is deleted, it will cause all existing PVCs on the
pod's node to hang permanently until all application pods are restarted.

The process for updating CSI plugin pods is to perform the following steps on each Kubernetes node
sequentially:
1. `cordon` and `drain` the node
2. When the node is drained, delete the plugin pod on the node (optionally, the node can be rebooted)
3. `uncordon` the node
4. Proceed to the next node once pods on the node are running again and stable

For modifications, see [Modifying CSI Networking](#modifying-csi-networking).

## Multus
`network.provider: multus`

@@ -114,13 +93,6 @@ isolation.

### Multus Prerequisites

These prerequisites apply when:
- CephCluster `network.selector['public']` is specified, AND
- Operator config `CSI_ENABLE_HOST_NETWORK` is `"true"` (or unspecified), AND
- Operator config `CSI_DISABLE_HOLDER_PODS` is `"true"`

If any of the above do not apply, these prerequisites can be skipped.

In order for host network-enabled Ceph-CSI to communicate with a Multus-enabled CephCluster, some
setup is required for Kubernetes hosts.

@@ -135,14 +107,14 @@ Two basic requirements must be met:
These two requirements can be broken down further as follows:

1. For routing Kubernetes hosts to the Multus public network, each host must ensure the following:
1. the host must have an interface connected to the Multus public network (the "public-network-interface").
2. the "public-network-interface" must have an IP address.
3. a route must exist to direct traffic destined for pods on the Multus public network through
the "public-network-interface".
1. the host must have an interface connected to the Multus public network (the "public-network-interface").
2. the "public-network-interface" must have an IP address.
3. a route must exist to direct traffic destined for pods on the Multus public network through
the "public-network-interface".
2. For routing pods on the Multus public network to Kubernetes hosts, the public
NetworkAttachmentDefinition must be configured to ensure the following:
1. The definition must have its IP Address Management (IPAM) configured to route traffic destined
for nodes through the network.
1. The definition must have its IP Address Management (IPAM) configured to route traffic destined
for nodes through the network.
3. To ensure routing between the two networks works properly, no IP address assigned to a node can
overlap with any IP address assigned to a pod on the Multus public network.
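
As a concrete illustration of requirement 1, a host can be given a macvlan shim interface on top of
the physical interface, with an address and a route toward the pod range. The commands below are a
minimal sketch only; the interface names (`eth0`, `public-shim`), the host address
(`192.168.252.1/22`), and the pod CIDR (`192.168.0.0/18`) are assumed example values, not values
taken from this change.

```console
# Hypothetical host-side setup; adjust names and CIDRs to the actual network plan.
# 1. Attach the host to the Multus public network via a macvlan shim on eth0.
ip link add public-shim link eth0 type macvlan mode bridge
# 2. Give the shim interface an IP address from the host address range.
ip addr add 192.168.252.1/22 dev public-shim
ip link set public-shim up
# 3. Route traffic destined for pods on the Multus public network through the shim.
ip route add 192.168.0.0/18 dev public-shim
```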

@@ -155,10 +127,6 @@ understand and implement these requirements.
need to be an order of magnitude larger (or more) than the host address space to allow the
storage cluster to grow in the future.

If these prerequisites are not achievable, the remaining option is to set the Rook operator config
`CSI_ENABLE_HOST_NETWORK: "false"` as documented in the [CSI Host Networking](#csi-host-networking)
section.
### Multus Configuration

Refer to [Multus documentation](https://github.com/k8snetworkplumbingwg/multus-cni/blob/master/docs/how-to-use.md)
@@ -265,6 +233,7 @@ writing it's unclear when this will be supported.
#### Macvlan, Whereabouts, Node Dynamic IPs

The network plan for this cluster will be as follows:

- The underlying network supporting the public network will be attached to hosts at `eth0`
- Macvlan will be used to attach pods to `eth0`
- Pods and nodes will have separate IP ranges
@@ -323,6 +292,7 @@ spec:
#### Macvlan, Whereabouts, Node Static IPs

The network plan for this cluster will be as follows:

- The underlying network supporting the public network will be attached to hosts at `eth0`
- Macvlan will be used to attach pods to `eth0`
- Pods and nodes will share the IP range 192.168.0.0/16
@@ -381,6 +351,7 @@ spec:
#### Macvlan, DHCP

The network plan for this cluster will be as follows:

- The underlying network supporting the public network will be attached to hosts at `eth0`
- Macvlan will be used to attach pods to `eth0`
- Pods and nodes will share the IP range 192.168.0.0/16
@@ -418,19 +389,33 @@ spec:
}'
```

## Modifying CSI networking
## Holder Pod Deprecation

Rook plans to remove CSI "holder" pods in Rook v1.16. CephClusters with `csi-*plugin-holder-*` pods
present in the Rook operator namespace must plan to set `CSI_DISABLE_HOLDER_PODS` to `"true"` after
Rook v1.14 is installed and before v1.16 is installed, following the migration sections below.
CephClusters with no holder pods do not need to follow the migration steps.

Helm users will set `csi.disableHolderPods: true` in values.yaml instead of `CSI_DISABLE_HOLDER_PODS`.
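
For non-Helm deployments, this operator setting lives in the `rook-ceph-operator-config` ConfigMap,
which the operator watches for changes. A minimal sketch, assuming the default `rook-ceph` operator
namespace:

```console
# Set the flag in the operator config ConfigMap.
kubectl -n rook-ceph patch configmap rook-ceph-operator-config --type merge \
  -p '{"data":{"CSI_DISABLE_HOLDER_PODS":"true"}}'
```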

### Disabling Holder Pods with Multus and CSI Host Networking
CephClusters that do not use `network.provider: multus` can follow the
[Disabling Holder Pods](#disabling-holder-pods) section.

This migration section applies in the following scenarios:
- CephCluster `network.provider` is `"multus"`, AND
- Operator config `CSI_DISABLE_HOLDER_PODS` is changed to `"true"`, AND
- Operator config `CSI_ENABLE_HOST_NETWORK` is (or is modified to be) `"true"`
CephClusters that use `network.provider: multus` will need to plan the migration more carefully.
Read the [Disabling Holder Pods with Multus](#disabling-holder-pods-with-multus) section in full
before beginning.

If the scenario does not apply, skip ahead to the
[Disabling Holder Pods](#disabling-holder-pods) section below.
!!! hint
To determine if holder pods are deployed, use
`kubectl --namespace $ROOK_OPERATOR get pods | grep plugin-holder`

### Disabling Holder Pods with Multus

This migration section applies when any CephCluster `network.provider` is `"multus"`. If the
scenario does not apply, skip ahead to the [Disabling Holder Pods](#disabling-holder-pods) section.

**Step 1**

Before setting `CSI_ENABLE_HOST_NETWORK: "true"` and `CSI_DISABLE_HOLDER_PODS: "true"`, thoroughly
read through the [Multus Prerequisites section](#multus-prerequisites). Use the prerequisites
section to develop a plan for modifying host configurations as well as the public
@@ -439,21 +424,25 @@ NetworkAttachmentDefinition.
Once the plan is developed, execute the plan by following the steps below.

**Step 2**

First, modify the public NetworkAttachmentDefinition as needed. For example, it may be necessary to
add the `routes` directive to the Whereabouts IPAM configuration as in
[this example](#macvlan-whereabouts-node-static-ips).

**Step 3**

Next, modify the host configurations in the host configuration system. The host configuration system
may be something like PXE, ignition config, cloud-init, Ansible, or any other such system. A node
reboot is likely necessary to apply configuration updates, but wait until the next step to reboot
nodes.

**Step 4**

After the NetworkAttachmentDefinition is modified, OSD pods must be restarted. It is easiest to
complete this requirement at the same time nodes are being rebooted to apply configuration updates.

For each node in the Kubernetes cluster:

1. `cordon` and `drain` the node
2. Wait for all pods to drain
3. Reboot the node, ensuring the new host configuration will be applied
Expand All @@ -467,6 +456,7 @@ restarted as part of the `drain` and `undrain` process on each node.
OSDs can be restarted manually if node configuration updates do not require reboot.

**Step 5**

Once all nodes are running the new configuration and all OSDs have been restarted, check that the
new node and NetworkAttachmentDefinition configurations are compatible. To do so, verify that each
node can `ping` OSD pods via the public network.
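
One hedged way to run this check, assuming the default `rook-ceph` namespace and that Multus
records addresses in the `k8s.v1.cni.cncf.io/network-status` annotation (the address pinged below
is a placeholder):

```console
# List each OSD pod alongside the annotation that records its Multus public-network address.
kubectl -n rook-ceph get pods -l app=rook-ceph-osd \
  -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.metadata.annotations.k8s\.v1\.cni\.cncf\.io/network-status}{"\n"}{end}'
# Then, from each Kubernetes node, ping one of the reported public-network addresses.
ping -c 3 192.168.20.22
```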
@@ -502,24 +492,27 @@ direction, or the network switch may have a firewall rule blocking the connection. Fix
the issue, then return to **Step 1**.

**Step 6**

If the above check succeeds for all nodes, proceed with the
[Disabling Holder Pods](#disabling-holder-pods) steps below.

### Disabling Holder Pods

This migration section applies when `CSI_DISABLE_HOLDER_PODS` is changed to `"true"`.

**Step 1**

If any CephClusters have Multus enabled (`network.provider: "multus"`), follow the
[Disabling Holder Pods with Multus and CSI Host Networking](#disabling-holder-pods-with-multus-and-csi-host-networking)
[Disabling Holder Pods with Multus](#disabling-holder-pods-with-multus)
steps above before continuing.
**Step 2**
Begin by setting `CSI_DISABLE_HOLDER_PODS: "true"` (and `CSI_ENABLE_HOST_NETWORK: "true"` if desired).
Begin by setting `CSI_DISABLE_HOLDER_PODS: "true"`. If `CSI_ENABLE_HOST_NETWORK` is set to
`"false"`, also set it to `"true"` at the same time.

After this, `csi-*plugin-*` pods will restart, and `csi-*plugin-holder-*` pods will remain running.

**Step 3**

Check that CSI pods are using the correct host networking configuration. Use the example below as
guidance (in the example, `CSI_ENABLE_HOST_NETWORK` is `"true"`):
```console
@@ -532,10 +525,12 @@ $ kubectl -n rook-ceph get -o yaml daemonsets.apps csi-nfsplugin | grep -i hostnetwork
```

**Step 4**

At this stage, PVCs for running applications are still using the holder pods. These PVCs must be
migrated from the holder pods to the new network. Follow the process below to do so.

For each node in the Kubernetes cluster:

1. `cordon` and `drain` the node
2. Wait for all pods to drain
3. Delete all `csi-*plugin-holder*` pods on the node (a new holder will take its place)
@@ -544,6 +539,7 @@ For each node in the Kubernetes cluster:
6. Proceed to the next node
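
A hedged per-node sketch of the loop above (the node name is a placeholder, the `rook-ceph`
namespace is assumed, and the `uncordon` at the end mirrors the drain workflow used elsewhere in
this guide):

```console
NODE=worker-1   # placeholder node name
kubectl cordon "$NODE"
kubectl drain "$NODE" --ignore-daemonsets --delete-emptydir-data
# Delete the holder pods scheduled on this node; the holder DaemonSet recreates them.
kubectl -n rook-ceph get pods --field-selector spec.nodeName="$NODE" -o name \
  | grep plugin-holder | xargs -r kubectl -n rook-ceph delete
kubectl uncordon "$NODE"
```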

**Step 5**

After this process is done for all Kubernetes nodes, it is safe to delete the `csi-*plugin-holder*`
daemonsets.

@@ -561,31 +557,5 @@ daemonset.apps "csi-rbdplugin-holder-my-cluster" deleted
```

**Step 6**
The migration is now complete! Congratulations!

### Applying CSI Networking

This migration section applies in the following scenario:
- `CSI_ENABLE_HOST_NETWORK` is modified, AND
- `CSI_DISABLE_HOLDER_PODS` is `"true"`

**Step 1**
If `CSI_DISABLE_HOLDER_PODS` is unspecified or is `"false"`, follow the
[Disabling Holder Pods](#disabling-holder-pods) section first.

**Step 2**
Begin by setting the desired `CSI_ENABLE_HOST_NETWORK` value.

**Step 3**
At this stage, PVCs for running applications are still using the old network. These PVCs must be
migrated to the new network. Follow the process below to do so.

For each node in the Kubernetes cluster:
1. `cordon` and `drain` the node
2. Wait for all pods to drain
3. `uncordon` and `undrain` the node
4. Wait for the node to be rehydrated and stable
5. Proceed to the next node

**Step 4**
The migration is now complete! Congratulations!
2 changes: 1 addition & 1 deletion Documentation/CRDs/specification.md
@@ -5556,7 +5556,7 @@ string
</td>
<td>
<em>(Optional)</em>
<p>The IP of this endpoint. As a legacy behavior, this supports being given a DNS-adressable hostname as well.</p>
<p>The IP of this endpoint. As a legacy behavior, this supports being given a DNS-addressable hostname as well.</p>
</td>
</tr>
<tr>
4 changes: 2 additions & 2 deletions Documentation/Contributing/rook-test-framework.md
@@ -41,11 +41,11 @@ virtual machine.
make build
```

Tag the newly built images to `rook/ceph:local-build` for running tests, or `rook/ceph:v1.14.0-beta.0` if creating example manifests::
Tag the newly built images to `rook/ceph:local-build` for running tests, or `rook/ceph:master` if creating example manifests::

```console
docker tag $(docker images|awk '/build-/ {print $1}') rook/ceph:local-build
docker tag rook/ceph:local-build rook/ceph:v1.14.0-beta.0
docker tag rook/ceph:local-build rook/ceph:master
```

## Run integration tests
2 changes: 1 addition & 1 deletion Documentation/Getting-Started/quickstart.md
@@ -36,7 +36,7 @@ To configure the Ceph storage cluster, at least one of these local storage options
A simple Rook cluster is created for Kubernetes with the following `kubectl` commands and [example manifests](https://github.com/rook/rook/blob/master/deploy/examples).

```console
$ git clone --single-branch --branch v1.14.0-beta.0 https://github.com/rook/rook.git
$ git clone --single-branch --branch v1.14.0 https://github.com/rook/rook.git
cd rook/deploy/examples
kubectl create -f crds.yaml -f common.yaml -f operator.yaml
kubectl create -f cluster.yaml
Expand Down
