Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Incorrect header validation #3310

Open
kenballus opened this issue Aug 12, 2023 · 3 comments
Open

Incorrect header validation #3310

kenballus opened this issue Aug 12, 2023 · 3 comments
Labels

Comments

@kenballus
Copy link

kenballus commented Aug 12, 2023

Header names

RFC 9110 says that HTTP header names are permitted to contain only the following characters:

"!" / "#" / "$" / "%" / "&" / "'" / "*"
/ "+" / "-" / "." / "^" / "_" / "`" / "|" / "~"
/ DIGIT / ALPHA

Tornado allows the following invalid bytes inside of header names:

['\x00', '\x01', '\x02', '\x03', '\x04', '\x05', '\x06', '\x07', '\x08', '\x09', '\x0b', '\x0c', '\x0d', '\x0e', '\x0f', '\x10', '\x11', '\x12', '\x13', '\x14', '\x15', '\x16', '\x17', '\x18', '\x19', '\x1a', '\x1b', '\x1c', '\x1d', '\x1e', '\x1f', '\x20', '\x22', '\x28', '\x29', '\x2c', '\x2f', '\x3b', '\x3c', '\x3d', '\x3e', '\x3f', '\x40', '\x5b', '\x5c', '\x5d', '\x7b', '\x7d', '\x7f', '\x80', '\x81', '\x82', '\x83', '\x84', '\x85', '\x86', '\x87', '\x88', '\x89', '\x8a', '\x8b', '\x8c', '\x8d', '\x8e', '\x8f', '\x90', '\x91', '\x92', '\x93', '\x94', '\x95', '\x96', '\x97', '\x98', '\x99', '\x9a', '\x9b', '\x9c', '\x9d', '\x9e', '\x9f', '\xa0', '\xa1', '\xa2', '\xa3', '\xa4', '\xa5', '\xa6', '\xa7', '\xa8', '\xa9', '\xaa', '\xab', '\xac', '\xad', '\xae', '\xaf', '\xb0', '\xb1', '\xb2', '\xb3', '\xb4', '\xb5', '\xb6', '\xb7', '\xb8', '\xb9', '\xba', '\xbb', '\xbc', '\xbd', '\xbe', '\xbf', '\xc0', '\xc1', '\xc2', '\xc3', '\xc4', '\xc5', '\xc6', '\xc7', '\xc8', '\xc9', '\xca', '\xcb', '\xcc', '\xcd', '\xce', '\xcf', '\xd0', '\xd1', '\xd2', '\xd3', '\xd4', '\xd5', '\xd6', '\xd7', '\xd8', '\xd9', '\xda', '\xdb', '\xdc', '\xdd', '\xde', '\xdf', '\xe0', '\xe1', '\xe2', '\xe3', '\xe4', '\xe5', '\xe6', '\xe7', '\xe8', '\xe9', '\xea', '\xeb', '\xec', '\xed', '\xee', '\xef', '\xf0', '\xf1', '\xf2', '\xf3', '\xf4', '\xf5', '\xf6', '\xf7', '\xf8', '\xf9', '\xfa', '\xfb', '\xfc', '\xfd', '\xfe', '\xff']

This is nearly all of the invalid bytes (only '\n' and ':' are handled correctly).

Of particular note is that '\x00', '\r', '\t', and ' ' are allowed through.

I suggest that we start rejecting headers wtih names that contain any of these characters, as most web servers do.

Header values

RFC 9110 says this:

Field values containing CR, LF, or NUL characters are invalid and dangerous, due to the varying ways that implementations might parse and interpret those characters; a recipient of CR, LF, or NUL within a field value MUST either reject the message or replace each of those characters with SP before further processing or forwarding of that message.

Tornado allows null bytes in header values.

I suggest that we start rejecting headers with values that contain null, as most web servers do.

@kenballus kenballus changed the title Incorrect header name validation Incorrect header validation Aug 12, 2023
@bdarnell
Copy link
Member

Tornado is definitely a product of the View Source era of web development and Postel's law - there are probably a lot of places like this where we're more permissive than the standards. I'm inclined to be cautious about becoming less permissive (unless there's a concrete risk as with underscores in Content-Length), though, since there's always the possibility that someone is relying on the old behavior.

The way the set of allowed header characters was documented changed between RFC 2616 and 7230. 2616 said "anything but control and separators", while 7230 listed the allowed characters. I don't think that's a semantic change (if so it wasn't noted in the "differences from 2616" section), but the new form is much clearer. In particular, it was less obvious that only ascii was allowed (I was under the impression that ISO-8859-1 was used), leading to confusion in #2043. 7230 does document some changes in this area, in particular forbidding leading/trailing whitespace.

I agree with your suggestions; nuls and whitespace seem like the most problematic characters here and the least likely to cause backwards-compatibility problems. I'm also inclined to forbid all non-ascii characters. But I'd probably queue these changes up for Tornado 7.0 instead of putting them in a minor release.

@kenballus
Copy link
Author

Tornado is definitely a product of the View Source era of web development and Postel's law - there are probably a lot of places like this where we're more permissive than the standards. I'm inclined to be cautious about becoming less permissive (unless there's a concrete risk as with underscores in Content-Length), though, since there's always the possibility that someone is relying on the old behavior.

This is understandable. I think most users expect a pretty strict interpretation of the standards these days, so it might be worth keeping a list of those places in which Tornado is more permissive than the standards recommend.

I agree with your suggestions; nuls and whitespace seem like the most problematic characters here and the least likely to cause backwards-compatibility problems. I'm also inclined to forbid all non-ascii characters. But I'd probably queue these changes up for Tornado 7.0 instead of putting them in a minor release.

I think it makes sense to delay changing behavior on non-ascii headers until 7.0, but nul could potentially be handled in a minor release, given that nul bytes aren't in any non-null UTF-8 characters.

@kenballus
Copy link
Author

Another thing: the RFC specifies that header names must be non-empty, but Tornado doesn't enforce this rule. Some proxies have had issues with empty header names in the past (CVE-2023-25725)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

2 participants