Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

HTTP request and response egress validation #3973

Closed

Conversation

walid-git
Copy link
Member

This implements HTTP character set validation run on the struct http after vcl_deliver, vcl_synth and vcl_backend_fetch. The idea is to pick up if the VCL program has ended up filling in illegal values that would violate the HTTP protocol.

If a problem is found, we will error out with a 503. If the error is caught in vcl_synth, the connection will be closed.

Authored by:
@mbgrydeland
@daghf

@nigoroll
Copy link
Member

nigoroll commented Sep 4, 2023

bugwash:

  • could/should this be implemented at the transport level?
  • but on the other hand, we should probably have the option to go to vcl_synth / vcl_error for error page customization

My personal vote goes to the latter aspect. I think we need to keep the option for customization, even if malformed responses were rightly considered an internal error.

@walid-git walid-git force-pushed the upstream_req_resp_validation branch 2 times, most recently from f6e9338 to f0cda9e Compare December 4, 2023 14:01
@walid-git walid-git force-pushed the upstream_req_resp_validation branch from f0cda9e to 1b7e9d0 Compare February 5, 2024 13:00
@walid-git
Copy link
Member Author

This PR should probably be discussed again in bugwash regarding error page customization brought up by Nils.

mbgrydeland and others added 4 commits May 13, 2024 11:56
This implements HTTP character set validation run on the `struct http`
after vcl_deliver, vcl_synth and vcl_backend_fetch. The idea is to pick up
if the VCL program has ended up filling in illegal values that would
violate the HTTP protocol.

If a problem is found, we will error out with a 503. If the error is
caught in vcl_synth, the connection will be closed.
There is no need to validate client response headers that we do not
intend to transmit.
@walid-git walid-git force-pushed the upstream_req_resp_validation branch from 1b7e9d0 to 504d274 Compare May 13, 2024 10:02
@walid-git
Copy link
Member Author

Bugwash: merge this with the existing FEATURE_BIT(VALIDATE_HEADERS) logic.

@walid-git
Copy link
Member Author

From last bugwash:

<slink> Along these lines, I would think it makes more sense to validate headers "on the way in and upon each change", rather than (again) at agress?

On a second thought, do we really want to make a task fail as soon as an invalid value is written to a header (or any other (be)req/(be)resp field) ? We might have users using headers as temporary variables in VCL and those might contain invalid data, but this should be fine as long as the invalid headers are used internally and removed before the req/resp leaves varnish. That's why I think having a full final check at the very end of VCL processing is a better approach.

@nigoroll
Copy link
Member

nigoroll commented Sep 2, 2024

Random notes from bugwash of my take on this:

  • I really think we should stick to the VALIDATE_HEADERS logic and check for any invalid headers as soon as we see them, and not as an afterthought before sending them
  • Use of headers as variables is bad practice. Use variables as variables.
  • Detecting any errors early is POLA and good for locating the source of the error

@dridi
Copy link
Member

dridi commented Sep 17, 2024

I strongly disagree with the arguments against validation of what we send. We needlessly prevent iterative modifications where intermediate steps may not be correct. See for example this snippet that was for a long time _the_ way to filter cookies in VCL:

  if (req.http.Cookie) {
    /* Warning: Not a pretty solution */
    /* Prefix header containing cookies with ';' */
    set req.http.Cookie = ";" + req.http.Cookie;
    /* Remove any spaces after ';' in header containing cookies */
    set req.http.Cookie = regsuball(req.http.Cookie, "; +", ";");
    /* Prefix cookies we want to preserve with one space */
    /* 'S{1,2}ESS[a-z0-9]+' is the regular expression matching a Drupal session cookie ({1,2} added for HTTPS support) */
    /* 'NO_CACHE' is usually set after a POST request to make sure issuing user see the results of his post */
    /* Keep in mind we should add here any cookie that should reach the backend such as splahs avoiding cookies */
    set req.http.Cookie = regsuball(req.http.Cookie, ";(S{1,2}ESS[a-z0-9]+|NO_CACHE|OATMEAL|CHOCOLATECHIP)=", "; \1=");
    /* Remove from the header any single Cookie not prefixed with a space until next ';' separator */
    set req.http.Cookie = regsuball(req.http.Cookie, ";[^ ][^;]*", "");
    /* Remove any '; ' at the start or the end of the header */
    set req.http.Cookie = regsuball(req.http.Cookie, "^[; ]+|[; ]+$", "");

    if (req.http.Cookie == "") {
      /* If there are no remaining cookies, remove the cookie header. */
      unset req.http.Cookie;
    }
  }

And while admittedly "not a pretty solution" (and clearly not the most efficient) it is still a construct someone may use to get to a valid point.

From what I remember the last consensus on the topic of variables was:

  • adding (req|bereq|beresp|resp|obj).var.* symbols
  • having them be STRING like *.http.*
  • storing them with a new OA_VARS similar to OA_HEADERS

Once we have something in VCL I may consider hardening this check. But right now this is solving a problem, and it is better to start lenient and harden later instead of starting with something prone to breaking existing valid setups to later relax it. Going crescendo would make this change more approachable.

If we had terrible logs that would make it nigh impossible to pinpoint when a header turned wrong, I could hear the POLA argument, but we do a more-than-decent job in that department. (dare I say Varnish shines when it comes to logging?)

@nigoroll
Copy link
Member

nigoroll commented Sep 23, 2024

@dridi what you ask for exists except for OA_VARS (which I support adding), just not as native symbols but rather vmod objects.
The code you quoted is a particularly horrible example (you hint at that) and I think we should really start to seriously deprecate using headers for invalid data. If VALIDATE_HEADERS is considered too drastic, I am fine with an intermediate step, but as strongly as you disagree, I disagree with first accepting invalid stuff and then, after the fact, failing. What about vmods which might use HEADERS for other http requests? What about VSL?
Just the other day I got a question along the lines of "why does X-Lastname: Müller not work?" and even attempting to set something like this in VCL should fail (optionally disabled by a feature flag).

@nigoroll
Copy link
Member

During today's BugWashWed, I think that I made up my mind about this:

  • If FEATURE_VALIDATE_HEADERS is not strict enough, it should be fixed.
  • If FEATURE_VALIDATE_HEADERS does not check url, status, reason, it should be fixed.
  • For cases where FEATURE_VALIDATE_HEADERS is deliberately disabled, vmod_std should have explicit check functions with the same checks as FEATURE_VALIDATE_HEADERS has if enabled, but explicitly controlled from VCL

@nigoroll
Copy link
Member

bugwash: close for now, @walid-git will work on a new proposal in the direction of what bugwash agreed

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants