Fix and optimize 304 object copying #4013

nigoroll · 2023-10-31T13:40:49Z

(edit: Added a second commit which depends on the first)

TL;DR: Fix trimstore use in 304 and copy objects efficiently ("defragmented")

Commit 1: Call trimstore only once when copying the body after a 304

It is my understanding that the objtrimstore stevedore API function is only to be called once when the object is complete, which I believe is also in line with the comment on ObjExtend() that The final flag must be set on the last call.

If this understanding of the API is correct, we did not adhere to it in the fetch code when we made a copy of an existing stale object after a 304 response: There, we iterated over the stale object and did not set the final flag just once when the object was complete, but rather after each storage segment was copied.

This commit fixes this, adds some pedentry to the simple storage and extends b00062.vtc to test this behavior specifically. On top, g6.vtc also triggered without the fix but the duplicate trim detection in place.

This issue has originally surfaced in the SLASH/fellow storage where the trimstore function implicitly asserted to only be called once.

Ref 115742b
Ref https://gitlab.com/uplex/varnish/slash/-/issues/33

Commit 2: Optimize 304 template object copying

When copying a stale object after a 304 revalidation, we iterated over it and allocated new storage for each storage segment.

So, at best, we kept the fragmentation of the existing object, or made it even worse. This is particularly relevant when the existing object was created from a chunked response, in which case the original segments might have been particularly small and, consequently, many in number.

Because we wait for the stale object to complete, we know the total length (OA_LEN) upfront, and can ask the storage serving the copy for exactly the right length. This is much more efficient, as fragmentation is lowered and the storage engine might make a much better allocation decision when it knows the full length rather than getting piecemeal requests.

We implement the improved allocation scheme through additional state kept for the duration of the template object copy: struct vbf_objiter_priv holds the pointer to the newly created busy object, the total yet unallocated length (initialized to OA_LEN of the existing object) and pointer/length to the unused portion of the currently open segment, if any.

mbgrydeland

Isn't there a problem with this patch set wrt to revalidation of objects that are still streaming?

That is, due to a TTL that is shorter than the time it takes to fetch the object, a 304 revalidation is started which then needs to stream the data from the still fetching object while creating the revalidated object.

I wonder if this will then be using the wrong length (OA_LEN not yet set, which happens in vbf_stp_fetchend), causing an assert it looks like. It would I think regardless mess up both the allocation sizes and the logic to set VFP_END once and only once.

mbgrydeland · 2023-11-14T15:15:57Z

Isn't there a problem with this patch set wrt to revalidation of objects that are still streaming?

Hmm, seems I was confused about the behaviour here. We do an explicit wait for BOS_FINISHED before creating the revalidated object, so that should be OK.

mbgrydeland · 2023-11-14T15:53:11Z

There is a problem with the patch when revalidating zero length objects. Varnish won't attempt to do a revalidation for these objects (when the length is zero), but a missbehaving backend answering 304 anyways will cause a panic. See the test case.

varnishtest ""

server s1 {
	rxreq
	txresp -nolen -hdr "Content-Length: 0" -hdr {Etag: "foo"}

	rxreq
	txresp -status 304 -nolen -hdr "Content-Length: 0" -hdr {Etag: "foo"}
} -start

varnish v1 -vcl+backend {
	sub vcl_backend_response {
		set beresp.ttl = 1s;
	}
} -start

client c1 {
	txreq
	rxresp

	txreq
	rxresp
} -run

delay 1

client c2 {
	txreq
	rxresp
} -run

delay 0.1

client c3 {
	txreq
	rxresp
} -run

nigoroll · 2023-12-04T10:14:07Z

@mbgrydeland good catch! I will add the test case and then improve this PR.

On the previous comment: Yes, we only copy the stale object once it is complete.

Test case by Martin Blix Grydeland, taken from #4013

nigoroll · 2023-12-04T11:08:46Z

@mbgrydeland I am going to force-push a simple removal of the triggering assertion:

diff --git a/bin/varnishd/cache/cache_fetch.c b/bin/varnishd/cache/cache_fetch.c
index 3e0d7cdc3..6fb47975f 100644
--- a/bin/varnishd/cache/cache_fetch.c
+++ b/bin/varnishd/cache/cache_fetch.c
@@ -883,7 +883,6 @@ vbf_stp_condfetch(struct worker *wrk, struct busyobj *bo)
        INIT_OBJ(vop, VBF_OBITER_PRIV_MAGIC);
        vop->bo = bo;
        vop->l = ObjGetLen(bo->wrk, stale_oc);
-       assert(vop->l > 0);
        if (ObjIterate(wrk, stale_oc, vop, vbf_objiterate, 0))
                (void)VFP_Error(bo->vfc, "Template object failed");

IMHO, skipping the iterator for a zero length object would also be an option, but we did call it before this PR and there might be future code depending on it. So let's keep it and have the iterator do its noop.

mbgrydeland · 2023-12-04T13:53:50Z

I'm wondering if there aren't some potential problems unrelated to this patch that we have uncovered. The fact that Varnish will enter 304 revalidation logic even though we never asked for it (didn't send out any IMS/INM headers) is worrying to me. I wonder if we shouldn't keep track somehow of whether to expect a 304, and error out if we still got a 304. Could prevent a future gotcha at least.

nigoroll · 2023-12-04T14:27:10Z

bugwash: waiting for @mbgrydeland's ok

mbgrydeland

Did you end up dropping the panic test case from the PR? I can't see it as part of the commits currently.

bin/varnishd/cache/cache_fetch.c

nigoroll · 2023-12-04T15:25:46Z

Did you end up dropping the panic test case from the PR? I can't see it as part of the commits currently.

I added it directly to trunk: 3fc690d 9a4653b

@mbgrydeland sorry for accidentally re-requesting your review too early.

It is my understanding that the objtrimstore stevedore API function is only to be called once when the object is complete, which I believe is also in line with the comment on ObjExtend() that "The final flag must be set on the last call". If this understanding of the API is correct, we did not adhere to it in the fetch code when we made a copy of an existing stale object after a 304 response: There, we iterated over the stale object and did not set the final flag just once when the object was complete, but rather after each storage segment was copied. This commit fixes this, adds some pedentry to the simple storage and extends b00062.vtc to test this behavior specifically. On top, g6.vtc also triggered without the fix but the duplicate trim detection in place. This issue has originally surfaced in the SLASH/fellow storage where the trimstore function implicitly asserted to only be called once. Ref 115742b Ref https://gitlab.com/uplex/varnish/slash/-/issues/33

When copying a stale object after a 304 revalidation, we iterated over it and allocated new storage for each storage segment. So, at best, we kept the fragmentation of the existing object, or made it even worse. This is particularly relevant when the existing object was created from a chunked response, in which case the original segments might have been particularly small and, consequently, many in number. Because we wait for the stale object to complete, we know the total length (OA_LEN) upfront, and can ask the storage serving the copy for exactly the right length. This is much more efficient, as fragmentation is lowered and the storage engine might make a much better allocation decision when it knows the full length rather than getting piecemeal requests. We implement the improved allocation scheme through additional state kept for the duration of the template object copy: struct vbf_objiter_priv holds the pointer to the newly created busy object, the total yet unallocated length (initialized to OA_LEN of the existing object) and pointer/length to the unused portion of the currently open segment, if any.

we need a signal if the fix is in

Ref 5583667 Ref #4013

nigoroll changed the title ~~Call trimstore only once when copying the body after a 304~~ Fix and optimize 304 object copying Nov 1, 2023

dridi requested a review from mbgrydeland November 13, 2023 14:58

mbgrydeland reviewed Nov 14, 2023

View reviewed changes

nigoroll added a commit that referenced this pull request Dec 4, 2023

Add test case for 304 response on zero-sized object

3fc690d

Test case by Martin Blix Grydeland, taken from #4013

nigoroll force-pushed the fetch_304_dup_trimstore branch from 8ec7719 to d7e0b26 Compare December 4, 2023 11:09

nigoroll mentioned this pull request Dec 4, 2023

Do we handle 304 responses correctly? #4026

Open

mbgrydeland reviewed Dec 4, 2023

View reviewed changes

bin/varnishd/cache/cache_fetch.c Outdated Show resolved Hide resolved

nigoroll requested a review from mbgrydeland December 4, 2023 15:23

nigoroll added 2 commits December 4, 2023 19:49

nigoroll force-pushed the fetch_304_dup_trimstore branch from d7e0b26 to 59fe52c Compare December 4, 2023 18:49

mbgrydeland approved these changes Dec 5, 2023

View reviewed changes

dridi merged commit 98b6b95 into varnishcache:master Dec 5, 2023

nigoroll deleted the fetch_304_dup_trimstore branch December 5, 2023 20:42

nigoroll added a commit that referenced this pull request Dec 5, 2023

VRT minor version bump for #4013

5583667

we need a signal if the fix is in

nigoroll added a commit that referenced this pull request Dec 7, 2023

Document VRT 18.1 change

8c0c414

Ref 5583667 Ref #4013

nigoroll mentioned this pull request Dec 18, 2023

Assert error in sml_trimstore() #3816

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Fix and optimize 304 object copying #4013

Fix and optimize 304 object copying #4013

Uh oh!

nigoroll commented Oct 31, 2023 •

edited

Loading

Uh oh!

mbgrydeland left a comment

Uh oh!

mbgrydeland commented Nov 14, 2023

Uh oh!

mbgrydeland commented Nov 14, 2023

Uh oh!

nigoroll commented Dec 4, 2023

Uh oh!

nigoroll commented Dec 4, 2023

Uh oh!

mbgrydeland commented Dec 4, 2023

Uh oh!

nigoroll commented Dec 4, 2023

Uh oh!

mbgrydeland left a comment

Uh oh!

Uh oh!

nigoroll commented Dec 4, 2023

Uh oh!

Uh oh!

Fix and optimize 304 object copying #4013

Fix and optimize 304 object copying #4013

Uh oh!

Conversation

nigoroll commented Oct 31, 2023 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Commit 1: Call trimstore only once when copying the body after a 304

Commit 2: Optimize 304 template object copying

Uh oh!

mbgrydeland left a comment

Choose a reason for hiding this comment

Uh oh!

mbgrydeland commented Nov 14, 2023

Uh oh!

mbgrydeland commented Nov 14, 2023

Uh oh!

nigoroll commented Dec 4, 2023

Uh oh!

nigoroll commented Dec 4, 2023

Uh oh!

mbgrydeland commented Dec 4, 2023

Uh oh!

nigoroll commented Dec 4, 2023

Uh oh!

mbgrydeland left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

nigoroll commented Dec 4, 2023

Uh oh!

Uh oh!

nigoroll commented Oct 31, 2023 •

edited

Loading