parse charset and decode text #11

Closed
wants to merge 4 commits into from

@orsinium orsinium commented Oct 8, 2019

close #3
close #6


    @CachedProperty
    def text(self):
        if not self.media_type.startswith('text/'):
Collaborator

intuitively i'd say utf-8 is a saner and more useful default. or does the spec explicitly forbid that?

Author

Yes, it does:

"If one is not specified, the media type of the data URI is assumed to be text/plain;charset=US-ASCII"

Collaborator

ok cool ascii it is then
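
For illustration, a minimal sketch of what the US-ASCII fallback means in practice. The helper `decode_with_default` is hypothetical and not part of this PR; it only mirrors the "charset or ascii" idea discussed above.

```python
def decode_with_default(data, charset=None):
    # Hypothetical helper, not from the PR: fall back to US-ASCII
    # when the data URI carries no charset parameter.
    return data.decode(charset or "ascii")


ascii_payload = b"hello"
utf8_payload = "caf\u00e9".encode("utf-8")  # contains non-ASCII bytes

print(decode_with_default(ascii_payload))
print(decode_with_default(utf8_payload, charset="utf-8"))

try:
    # No charset given, so the strict ASCII default rejects these bytes.
    decode_with_default(utf8_payload)
except UnicodeDecodeError:
    print("non-ASCII bytes need an explicit charset")
```

With the ASCII default, a data URI that carries UTF-8 bytes has to declare `charset=utf-8` explicitly; otherwise decoding fails rather than silently guessing.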

@@ -6,3 +6,6 @@ deps=-rrequirements-test.txt
commands=
    pytest --cov {envsitepackagesdir}/datauri {posargs} tests/
    flake8 datauri/

[flake8]
max-line-length = 90
Collaborator

this seems unrelated but ok

Author

Yes, it's not related to the issue, but with 79 chars I can't even put the link to the source of CachedProperty on one line without splitting it. That's silly.

@@ -22,6 +22,18 @@ class DataURIError(ValueError):
    pass


# https://github.com/bottlepy/bottle/commit/fa7733e075da0d790d809aa3d2f53071897e6f76
class CachedProperty(object):
Collaborator

nit: cached_property (or an alias cached_property = CachedProperty) makes usage look more like @property for which this is a drop-in replacement anyway

Author

Fixed 👍
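
A minimal sketch of what such a caching descriptor plus lowercase alias could look like. This illustrates the general non-data-descriptor technique and is not the code from this PR or from bottle; the `Example` class is hypothetical.

```python
class CachedProperty(object):
    """Compute the value on first access, then store it on the instance."""

    def __init__(self, func):
        self.func = func
        self.__doc__ = getattr(func, "__doc__", None)

    def __get__(self, obj, cls=None):
        if obj is None:
            return self  # accessed on the class, not on an instance
        # Storing the result under the same name shadows this non-data
        # descriptor, so later lookups read the instance __dict__ directly.
        value = obj.__dict__[self.func.__name__] = self.func(obj)
        return value


# Lowercase alias so that usage reads like the built-in @property.
cached_property = CachedProperty


class Example(object):
    calls = 0

    @cached_property
    def value(self):
        Example.calls += 1
        return 42


e = Example()
print(e.value, e.value, Example.calls)
```

The wrapped function runs once per instance; the second access never reaches the descriptor.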

Collaborator

wbolster commented Oct 9, 2019

in general, looks good. changelog entry and docs + example in README would be needed as well

Author

orsinium commented Oct 9, 2019

> changelog entry and docs + example

Oh, right. Sorry. Fixed 👍

Collaborator

@wbolster wbolster left a comment

looks good. added a remark about dealing with unknown encodings, other than that, all fine! 🚀

    def text(self):
        if not self.media_type.startswith('text/'):
            return None
        return self.data.decode(self.charset or 'ascii')
Collaborator

this will raise LookupError in case the encoding is a weird name like foo. should we error out or return None in that case?

Author

I think something should be raised, so that library users can decide for themselves how to handle such cases.
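
As a rough sketch of the caller-side choice this leaves open: `bytes.decode` raises `LookupError` for an unknown encoding name, and a library user who prefers `None` can wrap the call. The function `text_or_none` is hypothetical and not part of the library.

```python
def text_or_none(data, charset):
    """Hypothetical caller-side wrapper: treat unknown charsets as 'no text'."""
    try:
        return data.decode(charset)
    except LookupError:
        # The codec registry does not know this encoding name.
        return None


print(text_or_none(b"hello", "utf-8"))
print(text_or_none(b"hello", "foo"))
```

Because the library raises instead of returning `None`, callers who want stricter behaviour can simply let the exception propagate.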

@orsinium orsinium closed this May 4, 2022
Successfully merging this pull request may close these issues.

parse and expose "mediatype" if present
add support for text decoding
2 participants