HEAD requestes changed to GET #2107

b3k · 2016-10-04T09:42:02Z

It looks like varnish replaces HEAD requests to GET in before vcl_backend_fetch.

Not sure if this is bug or feature, but I didn't found anything about that in documentation.

VCL looks like (less important parts was cut-off):

sub vcl_recv {
set req.http.X-Req-Method-Entry = req.method;
return (hash);
}

sub vcl_backend_response {
set beresp.http.X-Varnish-Method-Entry = bereq.http.X-Req-Method-Entry;
set beresp.http.X-Req-Method = bereq.method;
return (deliver);
}

varnishlog (headers: X-Req-Method-Entry and X-Req-Method):

-   ReqStart       185.24.255.255 16410
-   ReqMethod      HEAD
-   ReqURL         /api/rest/health
-   ReqProtocol    HTTP/1.1
-   ReqHeader      Host: 178.216.255.255
-   ReqHeader      User-Agent: curl/7.50.3
-   ReqHeader      Accept: */*
-   ReqHeader      X-Forwarded-For: 185.24.255.255
-   VCL_call       RECV
-   ReqUnset       Host: 178.216.255.255
-   ReqHeader      Host: 178.216.255.255
-   ReqURL         /api/rest/health
-   ReqHeader      X-Varnish-URL: /api/rest/health
-   ReqHeader      Surrogate-Capability: key=ESI/1.0
-   ReqHeader      X-Req-Method-Entry: HEAD
-   VCL_return     hash
-   VCL_call       HASH
-   VCL_return     lookup
-   VCL_call       MISS
-   VCL_return     fetch
-   Link           bereq 32771 fetch
-   Timestamp      Fetch: 1475573555.869367 0.079345 0.079345
-   RespProtocol   HTTP/1.1
-   RespStatus     200
-   RespReason     OK
-   RespHeader     Server: nginx
-   RespHeader     Date: Tue, 04 Oct 2016 09:32:35 GMT
-   RespHeader     Content-Type: application/json; charset=utf-8
-   RespHeader     Content-Length: 2
-   RespHeader     X-Powered-By: React/alpha
-   RespHeader     Set-Cookie: frontend=d8ce5dc1d896a1c3ef344f7a1b; expires=Tue, 04 Oct 2016 12:32:35 GMT; max-age=10800; domain=test.net; path=/; httponly
-   RespHeader     X-Varnish-URL: /api/rest/health
-   RespHeader     X-Varnish-Method-Entry: HEAD
-   RespHeader     X-Cache-Control: no-store, no-cache, must-revalidate, post-check=0, pre-check=0
-   RespHeader     X-Req-Method: GET
-   RespHeader     X-Passed: 200
-   RespHeader     X-Varnish: 32770
-   RespHeader     Age: 0
-   RespHeader     Via: 1.1 varnish-v4
-   VCL_call       DELIVER
-   RespHeader     X-Cache: MISS
-   RespHeader     X-Cache-Hits: 0
-   RespUnset      X-Powered-By: React/alpha
-   RespUnset      Server: nginx
-   RespUnset      X-Varnish: 32770
-   RespUnset      Via: 1.1 varnish-v4
-   RespUnset      X-Varnish-URL: /api/rest/health
-   VCL_return     deliver

My workaround is to add vcl_backend_fetch subroutine and set the bereq.method using X-Req-Method-Entry header set in vcl_recv:

sub vcl_backend_fetch {
set bereq.method = bereq.http.X-Req-Method-Entry;
return (fetch);
}

varnishd version:

/# varnishd -V
varnishd (varnish-4.1.3 revision 5e3b6d2)

OS:

Linux 3.13.0-96-generic #143-Ubuntu SMP Mon Aug 29 20:15:20 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux

The text was updated successfully, but these errors were encountered:

nigoroll · 2016-10-04T14:54:27Z

Yes, varnish has "always" done this for misses - the idea being that caching the whole object makes more sense than just the headers, but interestingly it actually looks undocumented in the tree, but finding the explanation is easy: http://www.gossamer-threads.com/lists/varnish/misc/14319
Notice that if you actually do want to cache HEADs as separate objects, you need to change the cache key.

(Backport of 2bbcde1 with the fix b698aec squashed in) doc fixes #2107

hermunn · 2016-10-13T08:56:41Z

Backport review: Backported as c9598de.

sbraz · 2024-12-06T20:51:01Z

Hi,
Would it be possible to add a configuration example to the documentation in addition to this paragraph please?

In particular, to cache non-GET requests, req.method needs to be
saved to a header or variable in :ref:vcl_recv and restored to
bereq.method. Notice that caching non-GET requests typically also
requires changing the cache key in :ref:vcl_hash e.g. by also
hashing the request method and/or request body.

I implemented the following, is that the correct way to do it?

sub vcl_hash {
    hash_data(req.method);
}

sub vcl_recv {
    set req.http.X-Original-Method = req.method;
}

sub vcl_backend_fetch {
    if (bereq.http.X-Original-Method == "HEAD") {
        set bereq.method = bereq.http.X-Original-Method;
    }
}

This seems to work but I have some questions:

what causes Varnish to not include the X-Original-Method in requests sent to the backend? This is the expected behaviour for me but I'm curious as to why this happens because it seems like I added a header to the request object.
with this setup, if a GET was cached, it can't be used to satisfy subsequent HEAD requests, is there a workaround?
the doc mentions "needs to be saved to a header or variable". In the example above, I'm saving the method to a header, how would I save it to a variable?

dridi · 2024-12-09T07:34:41Z

Right now headers are the only pseudo-variables we have, and the only thing that transfers between client and backend contexts.

My suggestion to solve this would be a new req.hash_always_get flag (true by default to match the current behavior) to decide whether to turn any cache miss into a GET. We could then add an (optional?) OA_METHOD object attribute to account for the method during lookups.

gquintard · 2024-12-09T13:17:05Z

regarding documentation: @sbraz, would you mind having a look at https://www.varnish-software.com/developers/tutorials/caching-post-requests-varnish/ and tell us if it's missing information from your perspective?

sbraz · 2024-12-10T18:50:02Z

@dridi thanks for the explanation. Then why does the doc read "header or variable"? Shouldn't the variable part just be dropped as it could be confusing (it did confuse me)?

My suggestion to solve this

Do you mean to solve the issue of satisfying HEADs from previously cached GETs?

@gquintard, I've read the the documentation and it does cover a similar use case, yes. However, I think having something specific to HEAD requests would still be useful as it seems different.

In the POST example, vcl_recv contains an explicit return (hash), my VCL does not and yet it works, why? Is it because the built-in VCL for vcl_recv already returns hash for HEAD requests but not for POST requests?
Also, in my VCL, there is no need to unset the X-Original-Method header as it gets overwritten in any case, right?

Can you confirm my VCL is valid and could be included in the documentation (maybe just as a snippet here) ?

Sorry to insist but could you also explain what causes Varnish to not include the X-Original-Method in requests sent to the backend? This is the expected behaviour for me but I'm curious as to why this happens because it seems like I added a header to the request object. Maybe this is documented somewhere, but if it isn't, could something be added to the "VCL Steps" doc? I couldn't find anything explaining when frontend request headers are copied to backend header requests and why mine isn't.

Unrelated: https://www.varnish-software.com/developers/tutorials/varnish-builtin-vcl/ contains a typo ("response headeres"), can you please fix it?

dridi · 2024-12-11T16:46:55Z

@dridi thanks for the explanation. Then why does the doc read "header or variable"? Shouldn't the variable part just be dropped as it could be confusing (it did confuse me)?

I agree that this looks confusing. The documentation should probably be updated but @nigoroll recently announced a substantial update to the documentation so I would probably revisit this afterwards (not sure when though).

My suggestion to solve this

Do you mean to solve the issue of satisfying HEADs from previously cached GETs?

No, you can already do that out of the box. A cache miss on a HEAD request triggers a GET request, stores the whole response, and only replies with response headers to the client. A cache hit will also produce a proper HEAD response.

The problem my suggestion addresses is that any cache miss turns into a GET, this is not limited to HEAD requests (the only method for which this is appropriate).

You shouldn't need a workaround at all. HEAD is handled transparently by Varnish.

A workaround is needed when you need to cache the response for a different method like POST.

Sorry to insist but could you also explain what causes Varnish to not include the X-Original-Method in requests sent to the backend? This is the expected behaviour for me but I'm curious as to why this happens because it seems like I added a header to the request object.

You receive a client request (req) and if you need to trigger a fetch (for example because of a cache miss) then a backend request (bereq) is derived from the client request (that you may have modified) in the state it was when the fetch got triggered (modulus other changes like a x-varnish header).

Regarding your original snippet, it would actually look like this if you wanted to avoid sending the header to your backend:

sub vcl_recv {
    unset req.http.X-Fetch-Method; # prevent header injection
}

sub vcl_hash {
    if (req.method != "GET" && req.method != "HEAD") {
        hash_data(req.method);
    }
}

sub vcl_miss {
    if (req.method != "GET" && req.method != "HEAD") {
        set req.http.X-Fetch-Method = req.method;
    }
}

sub vcl_backend_fetch {
    if (bereq.http.X-Fetch-Method) {
        set bereq.method = bereq.http.X-Fetch-Method;
    }
    unset bereq.method = bereq.http.X-Fetch-Method;
}

This snippet treats GET/HEAD as the same, and only overrides non-HEAD cache misses. If we bypass the cache (return (pass)) Varnish leaves the method alone. After conditionally overriding the backend request method, the header is removed. This way the override logic is more robust and can handle retries.

Unrelated: https://www.varnish-software.com/developers/tutorials/varnish-builtin-vcl/ contains a typo ("response headeres"), can you please fix it?

@gquintard one two three not it!

Finally, replying to myself:

My suggestion to solve this would be a new req.hash_always_get flag [...]

I think a better name (considering existing flags) would be req.miss_always_get.

sbraz · 2024-12-11T17:30:03Z

No, you can already do that out of the box. A cache miss on a HEAD request triggers a GET request, stores the whole response, and only replies with response headers to the client. A cache hit will also produce a proper HEAD response.

Maybe my initial problem was not clear. I host large files for Linux distribution mirroring. Said distributions tend to crawl the mirror daily and send a large amount of HEAD requests. At the moment, all these HEAD requests result in GETs to the backend, which is wasteful.
My goal is to send HEAD requests to the backend for HEAD requests to the frontend. However, when I do this, the out-of-the-box behaviour you refer to no longer works. I was wondering if there was a way to:

cache HEADs separately
still satisfy HEAD requests from previous GETs

If there isn't, it's not a deal-breaker. I can live with duplicated objects when a GET came before a HEAD, I just want to know if keeping the out-of-the-box optimisation is possible.

Regarding your original snippet, it would actually look like this if you wanted to avoid sending the header to your backend

You are right, the header did get sent to the backend, my bad. This makes sense: if it's not unset, it should be present. I understand this part better now.

This snippet treats GET/HEAD as the same, and only overrides non-HEAD cache misses

Your example VCL doesn't really work for my use case, it sends GET for HEAD, isn't it meant for caching POST requests instead? Also the last line from vcl_backend_fetch should be unset bereq.http.X-Fetch-Method.

I tried to tweak it to set the header in vcl_miss and it does work but I end up with the same result as my previous VCL where the header was set in vcl_recv. The following works fine but if I send a client HEAD after a GET, Varnish still sends a HEAD to the backend. Am I missing something?

sub vcl_recv {
    unset req.http.X-Fetch-Method; # prevent header injection
}

sub vcl_hash {
    # If I kept the "HEAD" part from your code, a client GET after a client HEAD returned the result from HEAD
    if (req.method != "GET") {
        hash_data(req.method);
    }
}

sub vcl_miss {
    # If I kept the "HEAD" from your code, Varnish kept sending GETs to the backend for HEADs
    if (req.method != "GET") { 
        set req.http.X-Fetch-Method = req.method;
    }       
}

sub vcl_backend_fetch {
    if (bereq.http.X-Fetch-Method) {
        set bereq.method = bereq.http.X-Fetch-Method;
    }
    unset bereq.http.X-Fetch-Method;
}

To me, this works more or less like my previous VCL (plus the unset bit which is nice). Why is it better to set the header in vcl_miss instead of vcl_recv?

Thanks again for your help, it is much appreciated.

dridi · 2024-12-12T07:36:10Z

You can actually do that, however it's a little tricky:

sub vcl_recv {
    unset bereq.http.X-Fetch-Method;
    if (req.restarts == 1) {
        return (hash);
    }
}

sub vcl_hash {
    if (req.restarts == 1) {
        hash_data(req.method);
    }
}

sub vcl_miss {
    if (req.method == "HEAD") {
        if (req.restarts == 0) {
            return (restart);
        }
        set req.http.X-Fetch-Method = "HEAD";
    }
}

sub vcl_backend_fetch {
    if (bereq.http.X-Fetch-Method) {
        set bereq.method = bereq.http.X-Fetch-Method;
    }
    unset bereq.http.X-Fetch-Method;
}

If a cache entry exists for a GET request, you can reuse it for subsequent HEAD requests. Otherwise, exceptionally send a HEAD request to the backend with a different cache key.

Restarts tend to make the VCL state machine harder to reason about, use them sparingly.

gquintard · 2024-12-12T08:00:49Z

@dridi, PR posted to the dev portal, @sbraz, thank you for pointing it out!

sbraz · 2024-12-13T20:17:15Z

You can actually do that, however it's a little tricky:

Thanks a lot, now that is something that could be worth adding to the documentation :) There's a typo in vcl_recv, it should unset unset req.http.X-Fetch-Method; but other than that, it works perfectly.

If I understand correctly (thanks in part to std.log), assuming the first request is a GET:

vcl_recv doesn't do anything special (except unsetting the internal header), neither do vcl_hash, vcl_miss or vcl_backend_fetch.
for a GET on the same URL, vcl_recv doesn't do anything special either and neither does vcl_hash. Then it's a hit.
a HEAD on the same URL behaves similarly, none of the conditions are entered, so we satisfy the HEAD from the cached GET, as per the documentation.

On the other hand, if the first request is a HEAD:

vcl_miss triggers a restart of the request processing.
vcl_recv immediately returns hash, I assume this is done to avoid executing the rest of the function?
vcl_hash hashes the HEAD method in addition to other parameters.
vcl_miss sets req.http.X-Fetch-Method.
vcl_backend_fetch performs a HEAD on the backend
for a HEAD on the same URL, we still enter vcl_miss once and the processing restarts, but then the hash is updated with the HEAD method, so we end up with a hit

Restarts tend to make the VCL state machine harder to reason about, use them sparingly.

This would be the only place in my VCL where I use them. I assume it's safe to reuse your snippet then?

I have one last question regarding backend requests. With the default behaviour of GET for all frontend requests, if I check varnishncsa's %r for a backend request, it contains HEAD /path although the method actually being used is GET, is that expected? The same goes for BereqMethod in varnishlog. Is this because BereqMethod is set before vcl_backend_fetch is executed?

sbraz · 2024-12-19T16:20:23Z

@dridi I noticed that all cached HEAD requests have Content-Length: 0 and it causes problems (this doesn't happen if a HEAD is satisfied from GET). Is that a Varnish bug or am I missing something? I'm running varnish-7.1.1 revision 7cee1c581bead20e88d101ab3d72afb29f14d87a.
EDIT: Same problem with varnish-7.6.1 revision c3d5882003eb87e5e93dc09fb9513ca96db3ca3c.

dridi · 2024-12-30T09:34:14Z

vcl_recv immediately returns hash, I assume this is done to avoid executing the rest of the function?

Correct, you already processed the original request once. Processing the request twice could "corrupt" it if operations are not idempotent.

This would be the only place in my VCL where I use them. I assume it's safe to reuse your snippet then?

It's probably fine. It becomes really messy once you have more than one reason to restart.

With the default behaviour of GET for all frontend requests, if I check varnishncsa's %r for a backend request, it contains HEAD /path although the method actually being used is GET, is that expected? The same goes for BereqMethod in varnishlog. Is this because BereqMethod is set before vcl_backend_fetch is executed?

Sounds like a bug, maybe open a new issue describing this behavior. It might actually be solved by #4213, can you give it a try first?

I noticed that all cached HEAD requests have Content-Length: 0 and it causes problems (this doesn't happen if a HEAD is satisfied from GET).

Sounds like a bug, please open a new issue with steps to reproduce.

sbraz · 2024-12-30T16:57:05Z

Thanks, I created #4244 for BereqMethod shown as HEAD although a GET is sent. Actually the biggest problem was that varnishncsa showed HEAD for %r but #4213 fixes this.
And, more importantly, #4245 for the zero Content-Length problem.

Once the Content-Length problem is fixed, could the example VCL with restarts be added to the documentation somewhere?

Previously, we would only keep the Content-Length header for HEAD requests on hit-for-miss objects, now we simply keep it always to enable "fallback" caching of HEAD requests. The added vtc implements the basics of the logic to enable the (reasonable) use case documented in varnishcache#2107 (comment) but using Vary instead of cache key modification plus restart. Fixes varnishcache#4245

nigoroll closed this as completed in 2bbcde1 Oct 4, 2016

vssync added the backport-review-4.1 label Oct 4, 2016

hermunn pushed a commit that referenced this issue Oct 13, 2016

document bereq preparation in core

c9598de

(Backport of 2bbcde1 with the fix b698aec squashed in) doc fixes #2107

hermunn removed the backport-review-4.1 label Oct 13, 2016

javiereguiluz mentioned this issue Feb 25, 2017

Unacessible routes in prod because HEAD request becomes GET symfony/symfony#21074

Closed

sbraz mentioned this issue Dec 30, 2024

Content-Length is always 0 for content obtained from a response to a HEAD request #4245

Open

nigoroll mentioned this issue Jan 2, 2025

cache_req_fsm: keep the cache object's Content-Length for HEAD always #4247

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

HEAD requestes changed to GET #2107

HEAD requestes changed to GET #2107

b3k commented Oct 4, 2016 •

edited

Loading

nigoroll commented Oct 4, 2016

hermunn commented Oct 13, 2016

sbraz commented Dec 6, 2024 •

edited

Loading

dridi commented Dec 9, 2024

gquintard commented Dec 9, 2024

sbraz commented Dec 10, 2024

dridi commented Dec 11, 2024

sbraz commented Dec 11, 2024

dridi commented Dec 12, 2024

gquintard commented Dec 12, 2024

sbraz commented Dec 13, 2024

sbraz commented Dec 19, 2024 •

edited

Loading

dridi commented Dec 30, 2024

sbraz commented Dec 30, 2024 •

edited

Loading

HEAD requestes changed to GET #2107

HEAD requestes changed to GET #2107

Comments

b3k commented Oct 4, 2016 • edited Loading

nigoroll commented Oct 4, 2016

hermunn commented Oct 13, 2016

sbraz commented Dec 6, 2024 • edited Loading

dridi commented Dec 9, 2024

gquintard commented Dec 9, 2024

sbraz commented Dec 10, 2024

dridi commented Dec 11, 2024

sbraz commented Dec 11, 2024

dridi commented Dec 12, 2024

gquintard commented Dec 12, 2024

sbraz commented Dec 13, 2024

sbraz commented Dec 19, 2024 • edited Loading

dridi commented Dec 30, 2024

sbraz commented Dec 30, 2024 • edited Loading

b3k commented Oct 4, 2016 •

edited

Loading

sbraz commented Dec 6, 2024 •

edited

Loading

sbraz commented Dec 19, 2024 •

edited

Loading

sbraz commented Dec 30, 2024 •

edited

Loading