[v190] fix: volumeClaimTemplates annotation bypass on re-add#10315
[v190] fix: volumeClaimTemplates annotation bypass on re-add#10315martindekov wants to merge 2 commits intoharvester:masterfrom
Conversation
Fixing volumeClaimTemplates bypassing the following scenarios: * oldAnn being nil is now a separate case and once we assign storage class we make sure the new annotation is valid json format for further unmarshalling and storage class exists * now we reject bad formats on the new annotation from the get go * oldVM being nil was concern which we discussed but nil guard is done in upper level so nil poiner panic is not possible * added storage class fakes which return error and extended test cases to cover all the discussed problematic scenarios; all existing tests pass meaning no changes in already existing validation logic only fixing the uncovered bugs Signed-off-by: Martin Dekov <martin.dekov@suse.com>
There was a problem hiding this comment.
Pull request overview
This PR fixes a VM webhook validation gap where harvesterhci.io/volumeClaimTemplates could be deleted and then re-added with malformed content (invalid JSON or non-existent StorageClass) because the prior logic skipped validation when the old annotation was empty.
Changes:
- Adjust
checkVolumeAnnotationsto always unmarshal/validate the new annotation when it changes, including the “old annotation empty” re-add case. - Add explicit StorageClass existence validation for first-time (re-)adds of the annotation.
- Extend unit tests and add a fake StorageClass cache that can simulate internal cache errors.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
pkg/webhook/resources/virtualmachine/validator.go |
Fixes the bypass by validating new annotations even when the old annotation is empty; adds StorageClass existence checks for first-time adds. |
pkg/webhook/resources/virtualmachine/validator_test.go |
Adds/extends update-validation test cases, including malformed JSON, missing StorageClass, and internal cache error scenarios. |
pkg/util/fakeclients/storageclass.go |
Adds an ErroringStorageClassCache test helper to simulate cache failures. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
WebberHuang1118
left a comment
There was a problem hiding this comment.
LGTM. Just one refactoring suggestion—thanks for the enhancement.
Signed-off-by: Martin Dekov <martin.dekov@suse.com>
|
|
||
| func (v *vmValidator) checkStorageClassesExist(entries []util.VolumeClaimTemplateEntry) error { | ||
| for _, entry := range entries { | ||
| scName := entry.Spec.StorageClassName |
There was a problem hiding this comment.
v180 added the supporting of datavolume (sc is nested under datavolume), need to double check the datavolume case, if it is also encoded into the annotation, thanks.
e.g.
harvester/pkg/api/volume/handler.go
Line 485 in fdca388
Fixing volumeClaimTemplates bypassing the following scenarios:
Problem:
Old check allowed if oldAnnotation is empty to bypass the check and allow injecting malformed value to the new annotation. Malformed being - non existing storage class/invalid json etc.
Solution:
Make sure we don't skip validation if oldAnnotation is empty and extract that case in separate check to validate the flow of new annotation being added with proper format.
Related Issue(s):
#10165
Test plan:
Unit tests to cover all discussed scenarios were added
E2E Test Plan:
E2E Test Results:
Built environment with the change and created vms and storage classes
Store test values in variables
🟢 1. SC name change - rejected
🟢 2. Storage size reduction - rejected
🟢 3. Normal disk add to existing annotation - allowed
🟢 4. Normal annotation removal - allowed
🟢 5. Re-add with tampered SC after deletion - is now rejection
🟢 6. Re-add with invalid JSON after deletion - rejected
🟢 7. Re-add with valid SC after deletion - allowed
Additional documentation or context
I wanted to fix the bug, but not change the validation behaviour so traced historical evolution of the function body:
#2810 - added to address the original issue - core logic being added
#8772 - storage class validation logic building on top of the previous assumptions
#8669 - mostly refactoring but no logical changes
#10165 - the actual bug and current change