
Fix two issues encountered when testing LHv2 provisioning #142

Merged 2 commits, Sep 27, 2024

Conversation

@tserong (Contributor) commented Sep 26, 2024

Problem:

  1. Disks added for use by LHv2 are added successfully, but their status remains stuck with an error "spec and status of disks on node $NAME are being syncing and please retry later."
  2. Active LHv2 volumes attached to VMs appear as /dev/dm-* and /dev/nvme* devices on the host. NDM sees these and thinks they're real disks, and then creates new BD CRs from those devices.

Solution:

  1. Add a call to reflect.DeepEqual() to avoid unnecessary updates and the resulting inconsistent state.
  2. Abuse the existing vendor filter to pick up SPDK devices, and explicitly exclude /dev/dm- devices in ApplyExcludeFiltersForDisk().

Note: I intend to handle this more cleanly later as part of harvester/harvester#5059

Related Issue:
harvester/harvester#5274

Test plan:
N/A

Previously, every call to LonghornV2Provisioner.Update() would
trigger an update of the Longhorn node object and a requeue
even if nothing had changed. This resulted in the BD getting
stuck with the AddedToNode condition set to False and an error:

  admission webhook "validator.longhorn.io" denied the request:
  spec and status of disks on node harvester-node-0 are being
  syncing and please retry later.

Now we only try to sync if the DiskDriver has actually changed,
and everything behaves correctly.

Signed-off-by: Tim Serong <tserong@suse.com>
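
For illustration, a minimal sketch of the kind of guard this commit describes, using stand-in types (DiskSpec, needsSync and the field names are assumptions for this sketch, not the actual node-disk-manager or Longhorn types):

```go
package main

import (
	"fmt"
	"reflect"
)

// DiskSpec stands in for the Longhorn node disk spec; the real types in
// node-disk-manager / longhorn-manager are richer, this is only a sketch.
type DiskSpec struct {
	Path       string
	DiskDriver string
}

// needsSync returns true only when the desired disk map actually differs from
// what is already recorded on the node, mirroring the reflect.DeepEqual()
// guard described above so that no-op updates (and the resulting requeue
// loop) are skipped entirely.
func needsSync(current, desired map[string]DiskSpec) bool {
	return !reflect.DeepEqual(current, desired)
}

func main() {
	current := map[string]DiskSpec{"disk-1": {Path: "/dev/sdb", DiskDriver: "v2"}}
	desired := map[string]DiskSpec{"disk-1": {Path: "/dev/sdb", DiskDriver: "v2"}}
	fmt.Println(needsSync(current, desired)) // false: nothing changed, skip the node update
}
```
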
When Longhorn V2 volumes are created and attached to VMs, they
appear on Harvester hosts as /dev/dm-* and /dev/nvme* devices.
This is problematic because NDM treats these devices as actual
disks and creates BD CRs from them.

This change is a bit of a hack until I finish the work for
harvester/harvester#5059.

Signed-off-by: Tim Serong <tserong@suse.com>
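
For illustration, a rough sketch of the exclusion idea, assuming a simple prefix check with hypothetical names (excludedDevPathPrefixes, isExcludedDisk) rather than the real ApplyExcludeFiltersForDisk() implementation:

```go
package main

import (
	"fmt"
	"strings"
)

// excludedDevPathPrefixes illustrates explicitly filtering out device-mapper
// nodes; the real filter lives in ApplyExcludeFiltersForDisk() and is
// structured differently.
var excludedDevPathPrefixes = []string{"/dev/dm-"}

// isExcludedDisk reports whether a discovered block device should be skipped
// so that NDM does not create a BlockDevice CR for an attached LHv2 volume.
func isExcludedDisk(devPath string) bool {
	for _, prefix := range excludedDevPathPrefixes {
		if strings.HasPrefix(devPath, prefix) {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(isExcludedDisk("/dev/dm-0")) // true: device-mapper node, ignore it
	fmt.Println(isExcludedDisk("/dev/sdb"))  // false: looks like a real disk
}
```
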
@Vicente-Cheng (Collaborator) left a comment

just a nit, others LGTM!

Thanks for the enhancement.

```diff
 )

 var (
-	defaultExcludedVendors = []string{vendorFilterDefaultLonghorn}
+	defaultExcludedVendors = []string{vendorFilterDefaultLonghorn, modelFilterDefaultSPDK}
```
@Vicente-Cheng (Collaborator) commented on this diff:
Would you like to change the name of this variable?
We now mix both vendor and model filters into this slice.

I do not have a strong opinion about changing the name. If you want to keep it, I am also okay with that.
Thanks!

@tserong (Contributor, Author) replied Sep 27, 2024:
I'd like to keep it for now. I know it's a bit silly/incorrect, but I didn't rename it because I want to rework all this filter stuff separately for harvester/harvester#5059, and keeping the name avoids also having to tweak the SetExcludeFilters() function in this commit.

@Vicente-Cheng (Collaborator) commented:

@Mergifyio backport v0.7.x

mergify bot commented Sep 26, 2024:

backport v0.7.x

✅ Backports have been created

@Vicente-Cheng (Collaborator) commented Sep 26, 2024:

The failure is a side effect of https://github.com/harvester/node-disk-manager/blob/master/pkg/controller/blockdevice/scanner.go#L287

autoProvision should not update both Spec.Filesystem.Provisioned and Spec.Provision; if both fields are true, the remove operation fails.

We can ignore this error in this PR. I will file another PR to fix it.
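
For illustration, a hedged sketch of the intended follow-up: set only one of the two flags during auto-provisioning. BlockDeviceSpec, the autoProvision signature and the useV2 switch are assumptions for this sketch, not the real NDM code:

```go
package main

import "fmt"

// FilesystemInfo and BlockDeviceSpec are stand-ins for NDM's blockdevice
// spec; only the two fields mentioned above are modelled, and their
// semantics here are an assumption for illustration.
type FilesystemInfo struct {
	Provisioned bool
}

type BlockDeviceSpec struct {
	Provision  bool
	FileSystem FilesystemInfo
}

// autoProvision sketches the intended fix: set exactly one of the two
// provisioning flags, since a BD with both set cannot be removed cleanly.
func autoProvision(spec *BlockDeviceSpec, useV2 bool) {
	if useV2 {
		spec.Provision = true
	} else {
		spec.FileSystem.Provisioned = true
	}
}

func main() {
	var spec BlockDeviceSpec
	autoProvision(&spec, true)
	fmt.Printf("%+v\n", spec) // only Provision is true; FileSystem.Provisioned stays false
}
```
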
