Skip to content

Kubevirt: cdiupload retry, thin metrics #4509

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Feb 7, 2025

Conversation

andrewd-zededa
Copy link
Contributor

  • Implement retries of CDI upload to pvc to handle
    intermittent timeouts to k8s api.
  • Switch kubevirt memory metric as "_total" suffix
    was removed from "kubevirt_vmi_memory_domain_bytes"
  • Create DiskMetric with names for sdX and pvc name to help
    match them in system debug.
  • Fix AppDiskMetric thin allocation reporting by moving
    csihandler GetVolumeDetails to reading allocated space
    from lh volume object and Populate() filling in FileLocation.

@andrewd-zededa andrewd-zededa force-pushed the kubevirt-volumemgr-cdi branch 2 times, most recently from 43665e9 to 7c6799c Compare January 31, 2025 18:24
@andrewd-zededa
Copy link
Contributor Author

With the rebase I think this PR is approaching ready for review. I'll wait for the latest CI runs.

Copy link

codecov bot commented Jan 31, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 20.90%. Comparing base (ba08639) to head (dfeb823).
Report is 20 commits behind head on master.

Additional details and impacted files
@@           Coverage Diff           @@
##           master    #4509   +/-   ##
=======================================
  Coverage   20.90%   20.90%           
=======================================
  Files          13       13           
  Lines        2894     2894           
=======================================
  Hits          605      605           
  Misses       2163     2163           
  Partials      126      126           

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.


// waitForPVCUploadComplete: Loop until PVC upload annotations show upload complete
func waitForPVCUploadComplete(pvcName string, log *base.LogObject) error {
clientset, err := GetClientSet()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From where might we call this function which can take 500 seconds? Does it need to kick the watchdog?

Copy link
Contributor Author

@andrewd-zededa andrewd-zededa Feb 3, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is called in kubeapi/vitoapiserver.go RolloutDiskToPVC() which runs in the volume create worker context. I do see now that volumemgr calls a log.Fatal if AddWorkCreate -> TrySubmit returns queue full which seems to be set to a constant 20 right now.

Looks like the watchdog might not need to be kicked but if we get a burst of 20 queued up volume creates behind a slow one then the log.Fatal will take down pillar anyways.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see one issue in that waitForPVCUploadComplete is not using the caller's context which can cancel if the volume create is cancelled. I'll submit a change to handle that.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Its in the latest push

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. If it runs from the volume create handler then we don't need to worry about a watchdog.
But it makes sense stating that in a comment.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just added a comment over kubeapi.RolloutDiskToPVC and kubeapi.waitForPVCUploadComplete

@github-actions github-actions bot requested a review from eriknordmark February 3, 2025 22:34
@andrewd-zededa andrewd-zededa marked this pull request as ready for review February 3, 2025 23:00
@andrewd-zededa
Copy link
Contributor Author

Realized it wasn't marked ready for review, and we're already reviewing anyways so I changed it.

@@ -29,7 +29,8 @@ func StatAllocatedBytes(path string) (uint64, error) {
if err != nil {
return uint64(0), err
}
return uint64(stat.Blocks * int64(stat.Blksize)), nil
// stat.Blocks is always 512-byte blocks
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed, although POSIX says is implementation defined. Here, stat.Blksize should be usually 4Kb, which means this function was reporting 8x bigger the true allocated size... this is a bug, I would separate this fix into a dedicated commit (or even a PR)....

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can split this to another PR, as you noted POSIX does define it as implementation defined. I missed that musl's <sys/param.h> does define it under DEV_BSIZE. I could move this to another PR and pull that in with cgo.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it 512 even on zfs? Does this 8x overreporting show up in the UI?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

removed this change for now, will submit in another PR shortly

@@ -128,10 +128,17 @@ func (metrics *kubevirtMetrics) fill(domainName, metricName string, value interf
cpuNs := assignToInt64(value) * int64(time.Second)
r.CPUTotalNs = r.CPUTotalNs + uint64(cpuNs)
case "kubevirt_vmi_memory_usable_bytes":
// The amount of memory which can be reclaimed by balloon without pushing the guest system to swap,
// corresponds to ‘Available’ in /proc/meminfo
// https://kubevirt.io/monitoring/metrics.html#kubevirt
r.AvailableMemory = uint32(assignToInt64(value)) / BytesInMegabyte
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm just curious here: metrics are called ..._bytes although the return value it's in MB.... is this expected?

Also, is uint32 enough? That limits the maximum value to 4GB....

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes the uint32 downcast here before the division is incorrect, fix in the next push.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

resolved in latest push

Copy link
Contributor

@eriknordmark eriknordmark left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Run tests

Copy link
Contributor

@eriknordmark eriknordmark left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Restart tests

- Implement retries of CDI upload to pvc to handle
intermittent timeouts to k8s api.
- Switch kubevirt memory metric as "_total" suffix
was removed from "kubevirt_vmi_memory_domain_bytes"
- Create DiskMetric with names for sdX and pvc name to help
match them in system debug.
- Fix AppDiskMetric thin allocation reporting by moving
csihandler GetVolumeDetails to reading allocated space
from lh volume object and Populate() filling in FileLocation.
- Use thin allocation check when dir has prefix of
types.VolumeEncryptedDirName or types.VolumeClearDirName
to handle subdirs like kubevirt /persist/vault/volumes/replicas/...
- waitForPVCUploadComplete needs to use caller's context
to allow quicker exit if volume create is cancelled

Signed-off-by: Andrew Durbin <andrewd@zededa.com>
@andrewd-zededa
Copy link
Contributor Author

Another push for missing comments.

@eriknordmark eriknordmark merged commit 15144fd into lf-edge:master Feb 7, 2025
32 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants