-
Notifications
You must be signed in to change notification settings - Fork 393
Fix and optimize 304 object copying #4013
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn't there a problem with this patch set wrt to revalidation of objects that are still streaming?
That is, due to a TTL that is shorter than the time it takes to fetch the object, a 304 revalidation is started which then needs to stream the data from the still fetching object while creating the revalidated object.
I wonder if this will then be using the wrong length (OA_LEN not yet set, which happens in vbf_stp_fetchend), causing an assert it looks like. It would I think regardless mess up both the allocation sizes and the logic to set VFP_END once and only once.
Hmm, seems I was confused about the behaviour here. We do an explicit wait for BOS_FINISHED before creating the revalidated object, so that should be OK. |
There is a problem with the patch when revalidating zero length objects. Varnish won't attempt to do a revalidation for these objects (when the length is zero), but a missbehaving backend answering 304 anyways will cause a panic. See the test case.
|
@mbgrydeland good catch! I will add the test case and then improve this PR. On the previous comment: Yes, we only copy the stale object once it is complete. |
Test case by Martin Blix Grydeland, taken from #4013
@mbgrydeland I am going to force-push a simple removal of the triggering assertion: diff --git a/bin/varnishd/cache/cache_fetch.c b/bin/varnishd/cache/cache_fetch.c
index 3e0d7cdc3..6fb47975f 100644
--- a/bin/varnishd/cache/cache_fetch.c
+++ b/bin/varnishd/cache/cache_fetch.c
@@ -883,7 +883,6 @@ vbf_stp_condfetch(struct worker *wrk, struct busyobj *bo)
INIT_OBJ(vop, VBF_OBITER_PRIV_MAGIC);
vop->bo = bo;
vop->l = ObjGetLen(bo->wrk, stale_oc);
- assert(vop->l > 0);
if (ObjIterate(wrk, stale_oc, vop, vbf_objiterate, 0))
(void)VFP_Error(bo->vfc, "Template object failed");
IMHO, skipping the iterator for a zero length object would also be an option, but we did call it before this PR and there might be future code depending on it. So let's keep it and have the iterator do its noop. |
8ec7719
to
d7e0b26
Compare
I'm wondering if there aren't some potential problems unrelated to this patch that we have uncovered. The fact that Varnish will enter 304 revalidation logic even though we never asked for it (didn't send out any IMS/INM headers) is worrying to me. I wonder if we shouldn't keep track somehow of whether to expect a 304, and error out if we still got a 304. Could prevent a future gotcha at least. |
bugwash: waiting for @mbgrydeland's ok |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did you end up dropping the panic test case from the PR? I can't see it as part of the commits currently.
I added it directly to trunk: 3fc690d 9a4653b @mbgrydeland sorry for accidentally re-requesting your review too early. |
It is my understanding that the objtrimstore stevedore API function is only to be called once when the object is complete, which I believe is also in line with the comment on ObjExtend() that "The final flag must be set on the last call". If this understanding of the API is correct, we did not adhere to it in the fetch code when we made a copy of an existing stale object after a 304 response: There, we iterated over the stale object and did not set the final flag just once when the object was complete, but rather after each storage segment was copied. This commit fixes this, adds some pedentry to the simple storage and extends b00062.vtc to test this behavior specifically. On top, g6.vtc also triggered without the fix but the duplicate trim detection in place. This issue has originally surfaced in the SLASH/fellow storage where the trimstore function implicitly asserted to only be called once. Ref 115742b Ref https://gitlab.com/uplex/varnish/slash/-/issues/33
When copying a stale object after a 304 revalidation, we iterated over it and allocated new storage for each storage segment. So, at best, we kept the fragmentation of the existing object, or made it even worse. This is particularly relevant when the existing object was created from a chunked response, in which case the original segments might have been particularly small and, consequently, many in number. Because we wait for the stale object to complete, we know the total length (OA_LEN) upfront, and can ask the storage serving the copy for exactly the right length. This is much more efficient, as fragmentation is lowered and the storage engine might make a much better allocation decision when it knows the full length rather than getting piecemeal requests. We implement the improved allocation scheme through additional state kept for the duration of the template object copy: struct vbf_objiter_priv holds the pointer to the newly created busy object, the total yet unallocated length (initialized to OA_LEN of the existing object) and pointer/length to the unused portion of the currently open segment, if any.
d7e0b26
to
59fe52c
Compare
we need a signal if the fix is in
(edit: Added a second commit which depends on the first)
TL;DR: Fix trimstore use in 304 and copy objects efficiently ("defragmented")
Commit 1: Call trimstore only once when copying the body after a 304
It is my understanding that the objtrimstore stevedore API function is only to be called once when the object is complete, which I believe is also in line with the comment on
ObjExtend()
that The final flag must be set on the last call.If this understanding of the API is correct, we did not adhere to it in the fetch code when we made a copy of an existing stale object after a 304 response: There, we iterated over the stale object and did not set the final flag just once when the object was complete, but rather after each storage segment was copied.
This commit fixes this, adds some pedentry to the simple storage and extends b00062.vtc to test this behavior specifically. On top, g6.vtc also triggered without the fix but the duplicate trim detection in place.
This issue has originally surfaced in the SLASH/fellow storage where the trimstore function implicitly asserted to only be called once.
Ref 115742b
Ref https://gitlab.com/uplex/varnish/slash/-/issues/33
Commit 2: Optimize 304 template object copying
When copying a stale object after a 304 revalidation, we iterated over it and allocated new storage for each storage segment.
So, at best, we kept the fragmentation of the existing object, or made it even worse. This is particularly relevant when the existing object was created from a chunked response, in which case the original segments might have been particularly small and, consequently, many in number.
Because we wait for the stale object to complete, we know the total length (
OA_LEN
) upfront, and can ask the storage serving the copy for exactly the right length. This is much more efficient, as fragmentation is lowered and the storage engine might make a much better allocation decision when it knows the full length rather than getting piecemeal requests.We implement the improved allocation scheme through additional state kept for the duration of the template object copy:
struct vbf_objiter_priv
holds the pointer to the newly created busy object, the total yet unallocated length (initialized toOA_LEN
of the existing object) and pointer/length to the unused portion of the currently open segment, if any.