block: add support for discard #329
Comments
Yes. You probably want to look at #325 first though. |
Since this interface should be sector-aligned, should we break consistency and take offsets and lengths in sectors? Would that simplify the checks the spt backend needs to do if the implementation is specialized there (re #325 (comment))? |
That depends on what the backend implementation needs to do. As I've never needed to invoke a The Solo5 interface I would prefer is:
with the usual restrictions ( If you find out what the backend(s) actually need(s) to do to implement a discard (i.e. at least the |
With spt bindings on Linux, this will use fallocate with I'm not sure about tenders/hvt because that should just validate and pass params? I've also looked at virtio (spec amendment); it would use the VIRTIO_BLK_T_WRITE_ZEROES command with the unmap bit set (with some fallbacks which are easy to add). The offsets and lengths are expressed in 512-byte units, but the alignment requirement may be greater. |
(Just to document the virtio fallbacks while the spec is fresh in my mind: if VIRTIO_BLK_F_WRITE_ZEROES is not supported, do the whole thing manually. Otherwise, issue VIRTIO_BLK_T_WRITE_ZEROES with the unmap bit set according to whether VIRTIO_BLK_F_DISCARD is supported.) |
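The fallback rule in the comment above can be written down as a small decision function. This is only a sketch: the feature bit numbers are the ones I believe the virtio block device specification assigns, while `zero_plan`, `plan_zeroing` and `has_feature` are hypothetical names invented for illustration and are not part of Solo5 or any virtio driver.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Feature bit numbers, assumed to match the virtio block device spec. */
#define VIRTIO_BLK_F_DISCARD      13
#define VIRTIO_BLK_F_WRITE_ZEROES 14

/* Hypothetical description of how a zeroing discard would be carried out. */
enum zero_strategy {
    ZERO_MANUALLY,     /* no WRITE_ZEROES support: write zero-filled sectors */
    ZERO_WRITE_ZEROES  /* issue a VIRTIO_BLK_T_WRITE_ZEROES request */
};

struct zero_plan {
    enum zero_strategy strategy;
    bool unmap;        /* only meaningful for ZERO_WRITE_ZEROES */
};

static bool has_feature(uint64_t features, unsigned bit)
{
    assert(bit < 64);
    return ((features >> bit) & 1) != 0;
}

/* Pick a strategy from the feature bits negotiated with the device. */
struct zero_plan plan_zeroing(uint64_t negotiated_features)
{
    struct zero_plan plan = { ZERO_MANUALLY, false };

    if (has_feature(negotiated_features, VIRTIO_BLK_F_WRITE_ZEROES)) {
        plan.strategy = ZERO_WRITE_ZEROES;
        /* Ask the device to unmap only if it also advertises discard. */
        plan.unmap = has_feature(negotiated_features, VIRTIO_BLK_F_DISCARD);
    }
    return plan;
}
```

Note that a device offering only VIRTIO_BLK_F_DISCARD still ends up on the manual path here, since a plain discard does not promise that later reads return zeroes.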
That man page does not actually mention anything about the call issuing a Anyway, so the Linux backends need to issue a For hvt, you need to create a hypercall and back it with that implementation. Before you actually start on this, can you check with the *BSD folks you have around at the retreat what the equivalent calls are there? |
Oh. One more thing, just re-reading the manpage. It only talks about filesystem-backed fds. How do we do the equivalent if the |
Linux also implements it for block devices: At any rate, the logic for this will be the same regardless of what the backing store is; if ENOTSUP is returned, write zeroes manually. |
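A rough sketch of that logic on Linux, for illustration only. The flag combination `FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE` is my assumption (the flags meant in the thread are not shown here), and `discard_range` and `write_zeroes` are hypothetical helper names, not Solo5 functions. On Linux `ENOTSUP` and `EOPNOTSUPP` have the same value, so the check below covers both spellings.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Fallback: overwrite [offset, offset + len) with zeroes in small chunks. */
static int write_zeroes(int fd, off_t offset, off_t len)
{
    static const char zeroes[4096]; /* static storage is zero-initialized */

    while (len > 0) {
        size_t chunk = len < (off_t)sizeof zeroes ? (size_t)len : sizeof zeroes;
        ssize_t n = pwrite(fd, zeroes, chunk, offset);
        if (n < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        if (n == 0) {               /* no progress: avoid looping forever */
            errno = EIO;
            return -1;
        }
        offset += n;
        len -= n;
    }
    return 0;
}

/* Try to punch a hole; if the backing store cannot, zero the range by hand. */
int discard_range(int fd, off_t offset, off_t len)
{
    assert(offset >= 0 && len > 0);

    if (fallocate(fd, FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE,
                  offset, len) == 0)
        return 0;
    if (errno != EOPNOTSUPP)
        return -1;                  /* a real error, report it to the caller */
    return write_zeroes(fd, offset, len);
}
```

Either path leaves the range reading back as zeroes; only the hole-punching path can release the underlying storage.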
I don't think this exists on BSD per this recent thread: https://freebsd-arch.freebsd.narkive.com/JKZnzc4p/hole-punching-trim-etc |
That's implementation and configuration (mount options, actual device capabilities) dependent. |
Right, just make sure that you do that in a single |
@g2p Moving your questions from Slack here, as I'd like the design discussion to be in one place:
Correct.
I don't think keeping a 1M buffer around is an acceptable cost to pay just for the sake of being able to emulate discard. Stepping back a bit, do we need to emulate it at all? Why? In my mind, discard is just an "optimization" for SSDs and/or thinly provisioned storage. If the host doesn't support it, can't it just be a no-op? |
I need it to actually ensure later reads return zeroes. |
With further thought, I'd be good with flags that ask for zeroing or not, in addition to the out parameter and higher-level fallback logic mentioned above. |
Ok. It seems to me then that the simplest possible interface at the Solo5 layer is that mentioned in my previous comment (#329 (comment)), with the addition of a In other words, no emulation, and the application flow becomes:
I'm not sure what you mean by that, but if you mean adding a "is discard supported" flag to the struct returned by |
I like the EOPNOTSUPP approach very much, will do that. Re an explicit "ask for zeroing" flag (as opposed to discarding and having uninitialized data appear), it's not something I plan to use, and after looking at the complexity of implementing both code paths in virtio, I don't think it's worth it. We'll either offer safe and fast zeroing, or let the application deal with it. |
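To make the "no emulation" idea concrete, here is a toy sketch of how an application might react to an unsupported discard. Everything in it is hypothetical: `blk_discard`, `blk_write`, `blk_result_t`, `BLK_R_EOPNOTSUPP` and the in-memory `disk` are stand-ins made up for this example, not the Solo5 API and not the flow spelled out in the thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* All names below are hypothetical stand-ins, not the Solo5 API. */
typedef enum { BLK_R_OK, BLK_R_EINVAL, BLK_R_EOPNOTSUPP } blk_result_t;

#define BLK_SIZE  512u
#define BLK_COUNT 16u

/* Toy in-memory device so the flow can be exercised without a backend. */
static uint8_t disk[BLK_SIZE * BLK_COUNT];
static bool backend_supports_discard = false;

/* Block-aligned, non-empty, and entirely inside the device. */
static bool range_ok(uint64_t offset, uint64_t len)
{
    return len > 0 && offset % BLK_SIZE == 0 && len % BLK_SIZE == 0 &&
           offset <= sizeof disk && len <= sizeof disk - offset;
}

blk_result_t blk_write(uint64_t offset, const uint8_t *buf, size_t len)
{
    assert(buf != NULL);
    if (!range_ok(offset, len))
        return BLK_R_EINVAL;
    memcpy(disk + offset, buf, len);
    return BLK_R_OK;
}

/* The layer below does no emulation: it zeroes quickly or says it cannot. */
blk_result_t blk_discard(uint64_t offset, uint64_t len)
{
    if (!range_ok(offset, len))
        return BLK_R_EINVAL;
    if (!backend_supports_discard)
        return BLK_R_EOPNOTSUPP;
    memset(disk + offset, 0, len);
    return BLK_R_OK;
}

/* Application side: try discard, otherwise write zeroes block by block. */
blk_result_t app_zero_range(uint64_t offset, uint64_t len)
{
    static const uint8_t zero_block[BLK_SIZE];
    blk_result_t r = blk_discard(offset, len);

    if (r != BLK_R_EOPNOTSUPP)
        return r;                   /* success, or an error to propagate */
    for (uint64_t done = 0; done < len; done += BLK_SIZE) {
        r = blk_write(offset + done, zero_block, BLK_SIZE);
        if (r != BLK_R_OK)
            return r;
    }
    return BLK_R_OK;
}
```

The point of the split is that the fallback buffer (a single block here) lives in the application, which can decide whether zeroing by hand is worth the cost at all.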
Sounds good to me. |
I've started testing my virtio implementation, but it looks like QEMU doesn't support discard for virtio-blk, only for SCSI. I don't know of another supported VMM on Linux, so I'll drop that code for now. Still making progress on the spt/hvt backends. |
crosvm has support for the feature: https://chromium.googlesource.com/chromiumos/platform/crosvm/+/7621d910f56ff85400b252f88fdef324a1cc13d6%5E%21/#F0 via https://bugs.chromium.org/p/chromium/issues/detail?id=850998 . Not sure how hard it would be to integrate. |
I've built and run crosvm, but I only get messages from early boot; I don't think it can run Solo5's kernel. |
The idea is to support operations compatible with mirage-block-unix's discard:
Which I would update to say this (matching the Unix implementation):