Commit b22e5ea

Merge branch 'master' into backupstore

yangchiu committed Feb 6, 2024
2 parents 3acaa32 + 8da8ec6
Showing 163 changed files with 4,636 additions and 1,068 deletions.
2 changes: 0 additions & 2 deletions .drone.yml
@@ -23,7 +23,6 @@ steps:
image: rancher/dapper:v0.5.3
commands:
- dapper
privileged: true
volumes:
- name: socket
path: /var/run/docker.sock
@@ -92,7 +91,6 @@ steps:
image: rancher/dapper:v0.5.3
commands:
- dapper
privileged: true
volumes:
- name: socket
path: /var/run/docker.sock
11 changes: 11 additions & 0 deletions .github/PULL_REQUEST_TEMPLATE.md
@@ -0,0 +1,11 @@
#### Which issue(s) this PR fixes:
<!--
Use `Issue #<issue number>` or `Issue longhorn/longhorn#<issue number>` or `Issue (paste link of issue)`. DON'T use `Fixes #<issue number>` or `Fixes (paste link of issue)`, as it will automatically close the linked issue when the PR is merged.
-->
Issue #

#### What this PR does / why we need it:

#### Special notes for your reviewer:

#### Additional documentation or context
2 changes: 1 addition & 1 deletion .github/workflows/publish.yaml
@@ -7,7 +7,7 @@ on:

jobs:
publish:
runs-on: [self-hosted, python3.8]
runs-on: ubuntu-latest
steps:
- name: Checkout
uses: actions/checkout@v2
2 changes: 1 addition & 1 deletion build_engine_test_images/Dockerfile.setup
@@ -15,7 +15,7 @@ RUN wget -q https://releases.hashicorp.com/terraform/${TERRAFORM_VERSION}/terraf
wget -q "https://github.com/mikefarah/yq/releases/download/${YQ_VERSION}/yq_linux_amd64" && \
mv yq_linux_amd64 /usr/local/bin/yq && \
chmod +x /usr/local/bin/yq && \
apk add openssh-client ca-certificates git rsync bash curl jq docker && \
apk add openssh-client ca-certificates git rsync bash curl jq && \
ssh-keygen -t rsa -b 4096 -N "" -f ~/.ssh/id_rsa

COPY [".", "$WORKSPACE"]
5 changes: 2 additions & 3 deletions build_engine_test_images/Jenkinsfile
@@ -15,12 +15,11 @@ node {
usernamePassword(credentialsId: 'DOCKER_CREDS', passwordVariable: 'DOCKER_PASSWORD', usernameVariable: 'DOCKER_USERNAME'),
usernamePassword(credentialsId: 'AWS_CREDS', passwordVariable: 'AWS_SECRET_KEY', usernameVariable: 'AWS_ACCESS_KEY')
]) {
stage('build') {
stage('build') {

sh "build_engine_test_images/scripts/build.sh"

sh """ docker run -itd --privileged -v /var/run/docker.sock:/var/run/docker.sock \
--name ${JOB_BASE_NAME}-${BUILD_NUMBER} \
sh """ docker run -itd --name ${JOB_BASE_NAME}-${BUILD_NUMBER} \
--env TF_VAR_build_engine_aws_access_key=${AWS_ACCESS_KEY} \
--env TF_VAR_build_engine_aws_secret_key=${AWS_SECRET_KEY} \
--env TF_VAR_docker_id=${DOCKER_USERNAME} \
24 changes: 0 additions & 24 deletions build_engine_test_images/run.sh
@@ -26,30 +26,6 @@ if [[ -z "$TF_VAR_docker_repo" ]]; then
exit 1
fi

# if commit_id is empty, we can directly check longhorn-engine:master-head's api version
if [[ -z "${TF_VAR_commit_id}" ]]; then

docker login -u="${TF_VAR_docker_id}" -p="${TF_VAR_docker_password}"
docker pull longhornio/longhorn-engine:master-head
version=`docker run longhornio/longhorn-engine:master-head longhorn version --client-only`
CLIAPIVersion=`echo $version | jq -r ".clientVersion.cliAPIVersion"`
CLIAPIMinVersion=`echo $version | jq -r ".clientVersion.cliAPIMinVersion"`
ControllerAPIVersion=`echo $version | jq -r ".clientVersion.controllerAPIVersion"`
ControllerAPIMinVersion=`echo $version | jq -r ".clientVersion.controllerAPIMinVersion"`
DataFormatVersion=`echo $version | jq -r ".clientVersion.dataFormatVersion"`
DataFormatMinVersion=`echo $version | jq -r ".clientVersion.dataFormatMinVersion"`
echo "latest engine version: ${version}"

upgrade_image="${TF_VAR_docker_repo}:upgrade-test.$CLIAPIVersion-$CLIAPIMinVersion"\
".$ControllerAPIVersion-$ControllerAPIMinVersion"\
".$DataFormatVersion-$DataFormatMinVersion"

if [[ $(docker manifest inspect "${upgrade_image}") != "" ]]; then
echo "latest engine test images have already published"
exit 0
fi
fi

trap ./scripts/cleanup.sh EXIT

# Build amd64 images
@@ -6,13 +6,17 @@ https://github.com/longhorn/longhorn/issues/2285

## Test Multus version below v4.0.0
**Given** Set up the Longhorn environment as mentioned [here](https://longhorn.github.io/longhorn-tests/manual/release-specific/v1.3.0/test-storage-network/)

**When** Run Longhorn core tests on the environment.
**Then** All the tests should pass.

**Then** All the tests should pass.

## Related issue:
https://github.com/longhorn/longhorn/issues/6953

## Test Multus version above v4.0.0
**Given** Set up the Longhorn environment as mentioned [here](https://longhorn.github.io/longhorn-tests/manual/release-specific/v1.6.0/test-storage-network/)

**When** Run Longhorn core tests on the environment.

**Then** All the tests should pass.
@@ -2,8 +2,10 @@
title: Cluster using customized kubelet root directory
---

1. Set up a cluster using a customized kubelet root directory.
e.g., launching k3s `k3s server --kubelet-arg "root-dir=/var/lib/longhorn-test" &`
1. Set up a cluster using a customized kubelet root directory.
For example, launching k3s:
- Controller: `k3s server --kubelet-arg "root-dir=/var/lib/longhorn-test"`
- Worker: `k3s agent --kubelet-arg "root-dir=/var/lib/longhorn-test"`
2. Install `Longhorn` with env `KUBELET_ROOT_DIR` in `longhorn-driver-deployer` set to the corresponding value (see the sketch after this list).
3. Launch a pod using Longhorn volumes via StorageClass. Everything should work fine.
4. Delete the pod and the PVC. Everything should be cleaned up.
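For step 2, a minimal sketch using the Helm chart, assuming the chart's `csi.kubeletRootDir` value is what populates `KUBELET_ROOT_DIR` in `longhorn-driver-deployer` (verify against the chart version you install):

```bash
# Sketch only: csi.kubeletRootDir is assumed to map to KUBELET_ROOT_DIR
# in longhorn-driver-deployer; the value matches the k3s root-dir above.
helm repo add longhorn https://charts.longhorn.io
helm repo update
helm install longhorn longhorn/longhorn \
  --namespace longhorn-system --create-namespace \
  --set csi.kubeletRootDir=/var/lib/longhorn-test
```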
@@ -5,17 +5,22 @@ title: "PVC provisioning with insufficient storage"
#### Related Issue:
- https://github.com/longhorn/longhorn/issues/4654
- https://github.com/longhorn/longhorn/issues/3529
- https://github.com/longhorn/longhorn/issues/6461

#### Root Cause Analysis
- https://github.com/longhorn/longhorn/issues/4654#issuecomment-1264870672

This case needs to be tested on both RWO and RWX volumes.

1. Create a PVC with size larger than 8589934591 GiB.
1. Create a PVC with size larger than `8589934591` GiB (see the oversize-PVC sketch below).
   - The Deployment stays in pending status, and the RWO/RWX volume keeps cycling through a create -> delete loop.
2. Create a PVC with size <= 8589934591 GiB, but greater than the actual available space size.
- RWO/RWX volume will be created, and volume will have annotation "longhorn.io/volume-scheduling-error": "insufficient storage volume scheduling failure" in it.
3. Create a PVC with size < the actual available space size,Resize the PVC to a not schedulable size
1. Create a PVC with size <= `8589934591` GiB, but greater than the actual available space size.
   - RWO/RWX volume will be created, and the associated PV for this volume will have the annotation "**longhorn.io/volume-scheduling-error**": "**insufficient storage**" in it.
   - We can observe "**Scheduling Failure**" and "**Replica Scheduling Failure**" error messages on the Longhorn UI with the following details:
     - **Scheduling Failure**
       - Replica Scheduling Failure
       - Error Message: insufficient storage
1. Create a PVC with size < the actual available space size, then resize the PVC to an unschedulable size.
   - After resizing the PVC to an unschedulable size, both RWO and RWX volumes remain in scheduling status.

We can modify/use https://raw.githubusercontent.com/longhorn/longhorn/master/examples/rwx/rwx-nginx-deployment.yaml to deploy RWO/RWX PVCs for this test.
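A hedged sketch of the first case above: a PVC requesting more than `8589934591` GiB (the PVC name and the `longhorn` storageClassName are assumptions; adjust to your cluster):

```bash
# Hypothetical oversize PVC; 8589934592 GiB exceeds the 8589934591 GiB limit,
# so the volume should enter the create -> delete loop described above.
kubectl apply -f - <<'EOF'
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: oversize-pvc
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: longhorn
  resources:
    requests:
      storage: 8589934592Gi
EOF
```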
3 changes: 3 additions & 0 deletions docs/content/manual/pre-release/ui/_index.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,3 @@
---
title: UI
---
14 changes: 14 additions & 0 deletions docs/content/manual/pre-release/ui/ui-sanity-check.md
@@ -0,0 +1,14 @@
---
title: ui sanity check
---

1. Access Longhorn UI on the latest/stable versions of `Chrome`, `Firefox` and `Safari`.
1. Check the pages. All the text, forms, and tables should render properly.
1. Verify all the links at the bottom; they shouldn't be broken and should redirect to the right pages.
1. Check the setting page; all the settings' text and values should be proper.
1. Create `Backing Image`, `volume`, `pv`, `pvc` and `recurring jobs` using the UI.
1. Take `volume snapshot`, create `volume backup`, and `system backup` using the UI.
1. Restore `Backup` and `system backup` using the UI.
1. Check the `events` on the dashboard; they should be normal.
1. Check the logs on the volume detail page; there shouldn't be any errors.
1. Check the browser's console; there shouldn't be any errors.
@@ -4,7 +4,7 @@ title: Test System Upgrade with New Instance Manager

1. Prepare 3 sets of longhorn-manager and longhorn-instance-manager images.
2. Deploy Longhorn with the 1st set of images.
3. Set `Guaranteed Engine Manager CPU` and `Guaranteed Replica Manager CPU` to 15 and 24, respectively.
3. Set `Guaranteed Instance Manager CPU` to 40 (see the sketch after this list).
Then wait for the instance manager recreation.
4. Create and attach a volume to a node (node1).
5. Upgrade the Longhorn system with the 2nd set of images.
@@ -13,4 +13,4 @@ title: Test System Upgrade with New Instance Manager
7. Upgrade the Longhorn system with the 3rd set of images.
8. Verify the pods of the 3rd instance manager cannot be launched on node1 since there is no available CPU for the allocation.
9. Detach the volume in the 1st instance manager pod.
Verify the related instance manager pods will be cleaned up and the new instance manager pod can be launched on node1.
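For step 3, a hedged sketch of updating the setting with kubectl, assuming the setting name `guaranteed-instance-manager-cpu` used by recent Longhorn releases (older releases split this into separate engine/replica settings):

```bash
# Assumed setting name; confirm with:
#   kubectl -n longhorn-system get settings.longhorn.io
kubectl -n longhorn-system patch settings.longhorn.io \
  guaranteed-instance-manager-cpu \
  --type=merge -p '{"value":"40"}'
```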
3 changes: 3 additions & 0 deletions docs/content/manual/pre-release/v2-volume/_index.md
@@ -0,0 +1,3 @@
---
title: v2 volume
---
14 changes: 14 additions & 0 deletions docs/content/manual/pre-release/v2-volume/sanity-check.md
@@ -0,0 +1,14 @@
---
title: v2 volume sanity check
---
## Related doc:
https://longhorn.io/docs/1.6.0/v2-data-engine/features/

- Support both amd64 and arm64
- Volume creation, attachment, detachment and deletion (see the StorageClass sketch at the end of this list)
- Automatic offline replica rebuilding
- [Orphaned replica management](https://github.com/longhorn/longhorn/issues/5827)
- Snapshot creation, deletion and reversion
- Volume backup and restoration
- [Selective v2 Data Engine activation](https://github.com/longhorn/longhorn/issues/7015)
- Upgrade Longhorn from previous version with v2 volume
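A hedged sketch for the volume-creation item: a StorageClass that provisions v2 volumes, assuming the `dataEngine: "v2"` parameter described in the v1.6.0 v2-data-engine docs and that the `v2-data-engine` setting is already enabled:

```bash
# Sketch only: the dataEngine parameter name is taken from the v1.6.0 docs;
# verify it against the Longhorn version under test.
kubectl apply -f - <<'EOF'
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: longhorn-v2
provisioner: driver.longhorn.io
allowVolumeExpansion: true
parameters:
  numberOfReplicas: "3"
  dataEngine: "v2"
EOF
```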
@@ -0,0 +1,43 @@
---
title: Test engine version enforcement
---

## Related issue
https://github.com/longhorn/longhorn/issues/5842
https://github.com/longhorn/longhorn/issues/7539

## Test step

**Given** a Longhorn v1.4.x cluster is running
And a volume (volume-1) is created and attached
And Longhorn is upgraded to v1.5.x
And a volume (volume-2) is created and attached

**When** Longhorn is upgraded to v1.6.0
**Then** the v1.6.0 longhorn-manager Pods should be in a crash loop
```
longhorn-manager-zrf8r 0/1 CrashLoopBackOff 2 (10s ago) 52s
longhorn-manager-zsph2 0/1 CrashLoopBackOff 2 (8s ago) 52s
longhorn-manager-grhsf 0/1 CrashLoopBackOff 2 (8s ago) 51s
```
And an incompatible-version error should appear in the longhorn-manager Pod logs
```
time="2023-08-17T03:03:20Z" level=fatal msg="Error starting manager: failed checking Engine upgarde path: incompatible Engine ei-7fa7c208 client API version: found version 7 is below required minimal version 8"
```

**When** Longhorn is downgraded to v1.5.x
**Then** Longhorn components should be running

**When** the v1.4.1 volume (volume-1) engine is upgraded
And Longhorn is upgraded to v1.6.0
**Then** Longhorn components should be running
And the v1.4.x EngineImage state should be deployed and incompatible should be true.
```
NAME INCOMPATIBLE STATE IMAGE REFCOUNT BUILDDATE AGE
ei-74783864 false deployed longhornio/longhorn-engine:v1.5.1 10 28d 12m
ei-7fa7c208 true deployed longhornio/longhorn-engine:v1.4.1 0 157d 13m
ei-ad420081 false deployed c3y1huang/research:2017-lh-ei 0 44h 24s
```

**When** updating existing volume/engine/replica custom resources' `spec.image` to `longhornio/longhorn-engine:v1.4.x`
**Then** the update should be blocked
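To observe the EngineImage table and the fatal log shown above, a hedged example:

```bash
# List engine images with their INCOMPATIBLE and STATE columns.
kubectl -n longhorn-system get engineimages.longhorn.io
# Tail the longhorn-manager Pods for the incompatible-version fatal message.
kubectl -n longhorn-system logs -l app=longhorn-manager --tail=20
```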
@@ -0,0 +1,18 @@
---
title: Test list backup when cluster has node cordoned before Longhorn installation
---

## Related issue
https://github.com/longhorn/longhorn/issues/7619

## Test step

**Given** a cluster has 3 worker nodes.
**And** 2 worker nodes are cordoned (see the sketch below).
**And** Longhorn is installed.

**When** setting up a backup target.

**Then** no error is observed on the UI Backup page.
**And** Backup custom resources are created if the backup target has existing backups.

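A hedged sketch of the Given steps (node names and the manifest version are placeholders):

```bash
# Cordon 2 of the 3 worker nodes before installing Longhorn;
# worker-2 and worker-3 are hypothetical node names.
kubectl cordon worker-2
kubectl cordon worker-3
# Then install Longhorn, e.g. from the release manifest.
kubectl apply -f https://raw.githubusercontent.com/longhorn/longhorn/v1.6.0/deploy/longhorn.yaml
```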
@@ -0,0 +1,47 @@
---
title: Test PVC Name and Namespace included in the volume metrics
---

## Related issues

- https://github.com/longhorn/longhorn/issues/5297
- https://github.com/longhorn/longhorn-manager/pull/2284

## Test step

**Given** 2 volumes (volume-1, volume-2) are created

**When** a PVC is created for volume-1 (see the static-provisioning sketch at the end of this test)
And volumes (volume-1, volume-2) are attached

**Then** metrics with the `longhorn_volume_` prefix should include `pvc="volume-1"`

```bash
curl -sSL http://10.0.2.212:32744/metrics | grep longhorn_volume | grep ip-10-0-2-151 | grep volume-1
longhorn_volume_actual_size_bytes{pvc_namespace="default",node="ip-10-0-2-151",pvc="volume-1",volume="volume-1"} 0
longhorn_volume_capacity_bytes{pvc_namespace="default",node="ip-10-0-2-151",pvc="volume-1",volume="volume-1"} 1.073741824e+09
longhorn_volume_read_iops{pvc_namespace="default",node="ip-10-0-2-151",pvc="volume-1",volume="volume-1"} 0
longhorn_volume_read_latency{pvc_namespace="default",node="ip-10-0-2-151",pvc="volume-1",volume="volume-1"} 0
longhorn_volume_read_throughput{pvc_namespace="default",node="ip-10-0-2-151",pvc="volume-1",volume="volume-1"} 0
longhorn_volume_robustness{pvc_namespace="default",node="ip-10-0-2-151",pvc="volume-1",volume="volume-1"} 1
longhorn_volume_state{pvc_namespace="default",node="ip-10-0-2-151",pvc="volume-1",volume="volume-1"} 2
longhorn_volume_write_iops{pvc_namespace="default",node="ip-10-0-2-151",pvc="volume-1",volume="volume-1"} 0
longhorn_volume_write_latency{pvc_namespace="default",node="ip-10-0-2-151",pvc="volume-1",volume="volume-1"} 0
longhorn_volume_write_throughput{pvc_namespace="default",node="ip-10-0-2-151",pvc="volume-1",volume="volume-1"} 0
```

And metrics with the `longhorn_volume_` prefix should include `pvc=""` for (volume-2)

```bash
curl -sSL http://10.0.2.212:32744/metrics | grep longhorn_volume | grep ip-10-0-2-151 | grep volume-2
longhorn_volume_actual_size_bytes{pvc_namespace="",node="ip-10-0-2-151",pvc="",volume="volume-2"} 0
longhorn_volume_capacity_bytes{pvc_namespace="",node="ip-10-0-2-151",pvc="",volume="volume-2"} 1.073741824e+09
longhorn_volume_read_iops{pvc_namespace="",node="ip-10-0-2-151",pvc="",volume="volume-2"} 0
longhorn_volume_read_latency{pvc_namespace="",node="ip-10-0-2-151",pvc="",volume="volume-2"} 0
longhorn_volume_read_throughput{pvc_namespace="",node="ip-10-0-2-151",pvc="",volume="volume-2"} 0
longhorn_volume_robustness{pvc_namespace="",node="ip-10-0-2-151",pvc="",volume="volume-2"} 1
longhorn_volume_state{pvc_namespace="",node="ip-10-0-2-151",pvc="",volume="volume-2"} 2
longhorn_volume_write_iops{pvc_namespace="",node="ip-10-0-2-151",pvc="",volume="volume-2"} 0
longhorn_volume_write_latency{pvc_namespace="",node="ip-10-0-2-151",pvc="",volume="volume-2"} 0
longhorn_volume_write_throughput{pvc_namespace="",node="ip-10-0-2-151",pvc="",volume="volume-2"} 0
```
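A hedged sketch of the "PVC created for volume" step using Longhorn static provisioning (names, namespace, and size are assumptions; the UI's "Create PV/PVC" action achieves the same):

```bash
# Bind a PV/PVC pair to the pre-created Longhorn volume "volume-1".
kubectl apply -f - <<'EOF'
apiVersion: v1
kind: PersistentVolume
metadata:
  name: volume-1
spec:
  capacity:
    storage: 1Gi
  accessModes:
    - ReadWriteOnce
  storageClassName: longhorn-static
  csi:
    driver: driver.longhorn.io
    volumeHandle: volume-1   # must match the existing Longhorn volume name
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: volume-1
  namespace: default
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: longhorn-static
  volumeName: volume-1
  resources:
    requests:
      storage: 1Gi
EOF
```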

This file was deleted.
