Respect Content-Type header when encoding unicode bodies #422

Closed
kgriffs opened this issue Jan 27, 2015 · 3 comments

Comments

@kgriffs
Member

kgriffs commented Jan 27, 2015

Currently, if resp.body is set to a unicode string, the framework always encodes it as UTF-8 in the response. For correctness, we should respect an alternate encoding when one is specified in the response's Content-Type header.

@lwcolton
Contributor

Can you explain this a little more? Would this apply only when the Content-Type is set to text/*, with the charset parameter then deciding how to encode the body? http://www.w3.org/Protocols/rfc1341/7_1_Text.html

@kgriffs
Member Author

kgriffs commented May 5, 2016

According to https://tools.ietf.org/html/rfc2046#section-4.1.2 :

Other media types than subtypes of "text" might choose to employ the charset parameter as defined here

For example, RFC 7303 defines multiple charsets for the XML media type, e.g.:

Content-Type: application/xml; charset=utf-16
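
To make the idea concrete, here is a minimal sketch of what respecting the charset parameter could look like. It is not Falcon's implementation; the helper names `charset_from_content_type` and `encode_body` are hypothetical, and the sketch assumes the header value is available as a plain string. Parsing is delegated to the standard library's `email.message.Message`.

```python
from email.message import Message


def charset_from_content_type(content_type, default="utf-8"):
    """Return the charset parameter of a Content-Type value, or a default.

    Hypothetical helper for illustration; not part of any framework.
    """
    msg = Message()
    msg["Content-Type"] = content_type
    # get_content_charset() lowercases the value and returns the
    # fallback when no charset parameter is present.
    return msg.get_content_charset(default)


def encode_body(text, content_type):
    """Encode text using the charset named in content_type.

    Raises LookupError if Python has no codec for that charset.
    """
    return text.encode(charset_from_content_type(content_type))


if __name__ == "__main__":
    body = encode_body("héllo", "application/xml; charset=utf-16")
    print(body.decode("utf-16"))
```

Note that Python's `utf-16` codec prepends a byte order mark when encoding, so the resulting bytes are self-describing with respect to byte order.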

@kgriffs kgriffs added the bug label May 5, 2016
@kgriffs kgriffs added this to the Backlog (Non-Breaking Changes) milestone May 5, 2016
@kgriffs kgriffs modified the milestones: Backlog (Non-Breaking Changes), Triaged (Non-Breaking Changes) Apr 25, 2017
@vytas7
Member

vytas7 commented Mar 26, 2023

I'm going to close this issue since it hasn't really attracted much attention, and it feels like it should be the responsibility of the respective media handler. For instance, JSON is standardized to almost always use UTF-8 (or just ASCII, by escaping non-ASCII characters).

resp.text is documented to encode in UTF-8, simple and clear.
For more advanced handling of text as media, see also #2037
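
As an illustration of the JSON point above, the following sketch shows how escaping keeps a JSON document within ASCII, which is also valid UTF-8. It uses only the standard library `json` module; the function name `dump_ascii_json` is hypothetical and this is not how any particular media handler is implemented.

```python
import json


def dump_ascii_json(obj):
    """Serialize obj to JSON bytes containing only ASCII characters.

    With ensure_ascii=True, non-ASCII characters are written as
    \\uXXXX escapes, so the output can be encoded as ASCII.
    """
    return json.dumps(obj, ensure_ascii=True).encode("ascii")


if __name__ == "__main__":
    payload = dump_ascii_json({"name": "café"})
    print(payload)
    # Pure ASCII is also valid UTF-8, so a UTF-8 decoder round-trips it.
    print(json.loads(payload.decode("utf-8")))
```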

@vytas7 vytas7 closed this as not planned Mar 26, 2023
@vytas7 vytas7 added the wontfix label Mar 26, 2023