HEAD requests changed to GET #2107
Comments
Yes, Varnish has "always" done this for misses, the idea being that caching the whole object makes more sense than caching just the headers. Interestingly, it looks undocumented in the tree, but the explanation is easy to find: http://www.gossamer-threads.com/lists/varnish/misc/14319 |
Backport review: Backported as c9598de. |
Hi,
I implemented the following, is that the correct way to do it?
sub vcl_hash {
    hash_data(req.method);
}
sub vcl_recv {
    set req.http.X-Original-Method = req.method;
}
sub vcl_backend_fetch {
    if (bereq.http.X-Original-Method == "HEAD") {
        set bereq.method = bereq.http.X-Original-Method;
    }
}
This seems to work but I have some questions:
|
Right now headers are the only pseudo-variables we have, and the only thing that transfers between client and backend contexts. My suggestion to solve this would be a new |
regarding documentation: @sbraz, would you mind having a look at https://www.varnish-software.com/developers/tutorials/caching-post-requests-varnish/ and tell us if it's missing information from your perspective? |
@dridi thanks for the explanation. Then why does the doc read "header or variable"? Shouldn't the variable part just be dropped as it could be confusing (it did confuse me)?
Do you mean to solve the issue of satisfying HEADs from previously cached GETs? @gquintard, I've read the documentation and it does cover a similar use case, yes. However, I think having something specific to HEAD requests would still be useful as it seems different. In the POST example, Can you confirm my VCL is valid and could be included in the documentation (maybe just as a snippet here)? Sorry to insist but could you also explain what causes Varnish to not include the Unrelated: https://www.varnish-software.com/developers/tutorials/varnish-builtin-vcl/ contains a typo ("response headeres"), can you please fix it? |
I agree that this looks confusing. The documentation should probably be updated but @nigoroll recently announced a substantial update to the documentation so I would probably revisit this afterwards (not sure when though).
No, you can already do that out of the box. A cache miss on a HEAD request triggers a GET request, stores the whole response, and only replies with response headers to the client. A cache hit will also produce a proper HEAD response. The problem my suggestion addresses is that any cache miss turns into a GET; this is not limited to HEAD requests (the only method for which this is appropriate). You shouldn't need a workaround at all. HEAD is handled transparently by Varnish. A workaround is needed when you need to cache the response for a different method like POST.
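To see the HEAD-to-GET conversion described above for yourself, a minimal sketch along these lines could log the method on both sides. The backend address and the log prefixes are assumptions made up for illustration; only `std.log`, `req.method` and `bereq.method` are standard.

```vcl
vcl 4.1;

import std;

# Hypothetical backend, adjust host and port to your setup.
backend default {
    .host = "127.0.0.1";
    .port = "8080";
}

sub vcl_recv {
    # Method as sent by the client (e.g. HEAD).
    std.log("client-method: " + req.method);
}

sub vcl_backend_fetch {
    # Method used for the fetch; on a cache miss for a HEAD
    # request this is expected to be GET.
    std.log("backend-method: " + bereq.method);
}
```

Running something like `varnishlog -g request -i VCL_Log` while sending a HEAD request for an uncached URL should then show both entries for the same request.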
You receive a client request ( Regarding your original snippet, it would actually look like this if you wanted to avoid sending the header to your backend:
sub vcl_recv {
    unset req.http.X-Fetch-Method; # prevent header injection
}
sub vcl_hash {
    # GET and HEAD share a cache key, other methods get their own
    if (req.method != "GET" && req.method != "HEAD") {
        hash_data(req.method);
    }
}
sub vcl_miss {
    # remember the client method for the backend fetch
    if (req.method != "GET" && req.method != "HEAD") {
        set req.http.X-Fetch-Method = req.method;
    }
}
sub vcl_backend_fetch {
    if (bereq.http.X-Fetch-Method) {
        set bereq.method = bereq.http.X-Fetch-Method;
    }
    unset bereq.http.X-Fetch-Method; # don't send the header to the backend
}
This snippet treats GET/HEAD as the same, and only overrides non-HEAD cache misses. If we bypass the cache (
@gquintard one two three not it! Finally, replying to myself:
I think a better name (considering existing flags) would be |
Maybe my initial problem was not clear. I host large files for Linux distribution mirroring. Said distributions tend to crawl the mirror daily and send a large amount of HEAD requests. At the moment, all these HEAD requests result in GETs to the backend, which is wasteful.
If there isn't, it's not a deal-breaker. I can live with duplicated objects when a GET came before a HEAD, I just want to know if keeping the out-of-the-box optimisation is possible.
You are right, the header did get sent to the backend, my bad. This makes sense: if it's not unset, it should be present. I understand this part better now.
Your example VCL doesn't really work for my use case, it sends GET for HEAD, isn't it meant for caching POST requests instead? Also the last line from I tried to tweak it to set the header in
sub vcl_recv {
    unset req.http.X-Fetch-Method; # prevent header injection
}
sub vcl_hash {
    # If I kept the "HEAD" part from your code, a client GET after a client HEAD returned the result from HEAD
    if (req.method != "GET") {
        hash_data(req.method);
    }
}
sub vcl_miss {
    # If I kept the "HEAD" from your code, Varnish kept sending GETs to the backend for HEADs
    if (req.method != "GET") {
        set req.http.X-Fetch-Method = req.method;
    }
}
sub vcl_backend_fetch {
    if (bereq.http.X-Fetch-Method) {
        set bereq.method = bereq.http.X-Fetch-Method;
    }
    unset bereq.http.X-Fetch-Method;
}
To me, this works more or less like my previous VCL (plus the Thanks again for your help, it is much appreciated. |
You can actually do that, however it's a little tricky:
sub vcl_recv {
    unset req.http.X-Fetch-Method; # prevent header injection
    if (req.restarts == 1) {
        # the request was already processed once, go straight to lookup
        return (hash);
    }
}
sub vcl_hash {
    if (req.restarts == 1) {
        # second pass: use a method-specific cache key
        hash_data(req.method);
    }
}
sub vcl_miss {
    if (req.method == "HEAD") {
        if (req.restarts == 0) {
            # no cached GET object, retry with the HEAD cache key
            return (restart);
        }
        set req.http.X-Fetch-Method = "HEAD";
    }
}
sub vcl_backend_fetch {
    if (bereq.http.X-Fetch-Method) {
        set bereq.method = bereq.http.X-Fetch-Method;
    }
    unset bereq.http.X-Fetch-Method; # don't send the header to the backend
}
If a cache entry exists for a GET request, you can reuse it for subsequent HEAD requests. Otherwise, exceptionally send a HEAD request to the backend with a different cache key. Restarts tend to make the VCL state machine harder to reason about; use them sparingly. |
Thanks a lot, now that is something that could be worth adding to the documentation :) There's a typo in If I understand correctly (thanks in part to
On the other hand, if the first request is a HEAD:
This would be the only place in my VCL where I use them. I assume it's safe to reuse your snippet then? I have one last question regarding backend requests. With the default behaviour of GET for all frontend requests, if I check |
@dridi I noticed that all cached HEAD requests have |
Correct, you already processed the original request once. Processing the request twice could "corrupt" it if operations are not idempotent.
It's probably fine. It becomes really messy once you have more than one reason to restart.
Sounds like a bug, maybe open a new issue describing this behavior. It might actually be solved by #4213, can you give it a try first?
Sounds like a bug, please open a new issue with steps to reproduce. |
Thanks, I created #4244 for Once the |
Previously, we would only keep the Content-Length header for HEAD requests on hit-for-miss objects, now we simply keep it always to enable "fallback" caching of HEAD requests. The added vtc implements the basics of the logic to enable the (reasonable) use case documented in varnishcache#2107 (comment) but using Vary instead of cache key modification plus restart. Fixes varnishcache#4245
It looks like Varnish changes HEAD requests to GET before vcl_backend_fetch. Not sure if this is a bug or a feature, but I didn't find anything about it in the documentation.
The VCL looks like this (less important parts were cut off):
varnishlog (headers: X-Req-Method-Entry and X-Req-Method):
My workaround is to add a vcl_backend_fetch subroutine and set bereq.method using the X-Req-Method-Entry header set in vcl_recv:
varnishd version:
OS: