content_text incorrectly takes precedence over content_html when parsing JSON Feed #492

Open
Rongronggg9 opened this issue Dec 23, 2024 · 2 comments

Comments

@Rongronggg9
Contributor

Rongronggg9 commented Dec 23, 2024

content_text incorrectly takes precedence over content_html when parsing JSON Feed, making it impossible to get content_html if both exist.

    if "content_text" in e:
        entry["content"] = c = FeedParserDict()
        c["value"] = e["content_text"]
        c["type"] = "text"
    elif "content_html" in e:
        entry["content"] = c = FeedParserDict()
        c["value"] = sanitize_html(
            e["content_html"], self.encoding, "application/json"
        )
        c["type"] = "html"
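
To illustrate the effect, here is a minimal standalone sketch that mimics the if/elif logic above. It is an approximation, not the actual feedparser code: `parse_content_current` is a hypothetical name, plain dicts stand in for `FeedParserDict`, and the `sanitize_html` step is left out.

```python
def parse_content_current(e):
    """Mimic the if/elif precedence quoted above, using plain dicts."""
    entry = {}
    if "content_text" in e:
        entry["content"] = {"value": e["content_text"], "type": "text"}
    elif "content_html" in e:
        # The real code sanitizes the HTML here; omitted in this sketch.
        entry["content"] = {"value": e["content_html"], "type": "html"}
    return entry


# A JSON Feed item that supplies both fields, which the spec allows.
item = {"content_text": "content", "content_html": "<p>content</p>"}
result = parse_content_current(item)
print(result["content"]["type"])  # text
```

Because the HTML branch is an `elif`, it is never reached when `content_text` is present, so the HTML value is dropped.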

According to https://www.jsonfeed.org/version/1.1/, content_text and content_html have equal standing; neither takes precedence over the other.

content_html and content_text are each optional strings — but one or both must be present.

Note that it uses both content_text and content_html, which is completely valid. An app such as iTunes, for instance, might prefer to use content_text, while a feed reader might prefer content_html.

Thus, a better way to parse it may be to adopt the Atom approach: make entries[i].content a list of dicts, e.g., [{"type": "text/plain", "value": "content"}, {"type": "text/html", "value": "<p>content</p>"}].
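
A rough sketch of what this could look like, assuming plain dicts and skipping HTML sanitization. The helper names `build_content_list` and `pick_content` are hypothetical and exist only for illustration; they are not part of feedparser.

```python
def build_content_list(item):
    """Collect every content variant of a JSON Feed item, Atom-style."""
    contents = []
    if "content_text" in item:
        contents.append({"type": "text/plain", "value": item["content_text"]})
    if "content_html" in item:
        # A real parser would sanitize the HTML before storing it.
        contents.append({"type": "text/html", "value": item["content_html"]})
    return contents


def pick_content(contents, preferred_type):
    """Return the value of the preferred type, else the first one available."""
    for c in contents:
        if c["type"] == preferred_type:
            return c["value"]
    return contents[0]["value"] if contents else None


item = {"content_text": "content", "content_html": "<p>content</p>"}
contents = build_content_list(item)
print(pick_content(contents, "text/html"))   # <p>content</p>
print(pick_content(contents, "text/plain"))  # content
```

With both variants kept, the choice between text and HTML moves to the consumer, which matches the spec's remark that different apps may prefer different fields.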

Such a change, admittedly, would break existing downstream projects using the develop branch. Hopefully, this won't be painful, considering JSON Feed support hasn't been released yet.

I am willing to open a PR for this if you think it is feasible.

@kurtmckee
Owner

Thanks for reporting this!

I think I may have caught this while working on a significant expansion of JSON Feed support. It's in a branch on this repo already, but I haven't worked on that in a while.

I think I can incorporate this issue report in that branch and get that merged, but I don't have a timeline for getting that done.

@Rongronggg9
Contributor Author

Good to know! I looked into https://github.com/kurtmckee/feedparser/tree/expand-json-feed-support and this seems to be a great project! The changes in that branch do indeed fix the issue. Thanks for the information.
