Netflix / x-element Public

Introduce “UnforgivingHtml” parser. #231

Open


theengineear wants to merge 1 commit into main from unforgiving-html


theengineear commented Dec 5, 2024 (edited)

Goals of the parser:

* Tighten control over things like double-quotes & closing tags.
* Improve error messaging for malformed markup.
* Improve performance.

TODOs / Questions:

* Do we need to strictly reject CDATA in the math and svg namespaces? See https://w3c.github.io/html-reference/syntax.html#cdata-sections and https://developer.mozilla.org/en-US/docs/Web/API/CDATASection. Note that when you attempt to add CDATA to html, it appears to be converted to an html comment; in svg and math, it works as expected. Pass, we can introduce new “forbidden” parsing states to throw an error here. It was already failing because it wouldn’t match, but we now add a special error-time lookup to throw a helpful error.
* Some of the validations around tag name, attribute name, and property name may be slow. Should we be less strict? Would the browser throw on any of these anyhow? Could we just let that happen? Punting this problem to later; let’s start strict.
* I think HTML is often assumed to be UTF-8, but JS strings (I think) are sometimes treated as UTF-16. That means we can accept UTF-16 characters and inject them as text content. But should we reject that? I would prefer to go as long as possible without concerning ourselves with character encodings. If / when we can think of a way this could break (i.e., differing character encodings in text / files), we could always revisit.
* ~~Is there a way to test UTF-8 multi-byte sequence interop with multi-byte UTF-16 surrogate pair interop? Maybe not a problem, but it could be important to know one way or the other!~~ I think if you really blow it, you will get some sort of error. See isWellFormed for some context.
* ~~Do we need special handling of the NUL character (U+0000)? It is forbidden in text / character data. What about “permanently undefined Unicode characters”?~~ YAGNI.
* Can text nodes be added in svg / math namespaces? Yes.
* ~~Should we use setAttributeNS when in a namespace?~~ I believe this would only be required if there were a prefix associated with an attribute name, like spec:align. Since we would reject that attribute name anyhow, I think we already can’t get into this situation. Going to not bother with this one.
* Do we need special consideration around the optional newline for pre / textarea? See https://html.spec.whatwg.org/multipage/syntax.html#element-restrictions. Indeed, if you set something like document.body.innerHTML = '<pre>\nhi</pre>', that initial newline will not appear. Yuck. Yes. Fixed. (still “yuck” though!)
* ~~Are there any more maps we ought to turn into more-performant switches?~~ This is a “later” problem.
* ~~Seems like there may be some conventions to use camelCase element names in SVG. Are we okay forcing them to be lowercase?~~ Might as well start strict.
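For context on the isWellFormed remark in the encoding items above, here is a minimal sketch of how a malformed UTF-16 string (one containing a lone surrogate) could be detected before it is injected as text content. The helper name `hasLoneSurrogate` is hypothetical and is not part of x-element; `String.prototype.isWellFormed` is a standard method but may be missing in older runtimes, so the sketch falls back to a manual scan.

```javascript
// Hypothetical helper for illustration only; not part of x-element.
// Returns true when the string contains a surrogate code unit that is not
// part of a valid high/low surrogate pair.
function hasLoneSurrogate(value) {
  // Prefer the built-in check where the runtime provides it.
  if (typeof value.isWellFormed === 'function') {
    return !value.isWellFormed();
  }
  // Fallback: scan the UTF-16 code units manually.
  for (let index = 0; index < value.length; index++) {
    const unit = value.charCodeAt(index);
    if (unit >= 0xD800 && unit <= 0xDBFF) {
      // A high surrogate must be followed by a low surrogate.
      const next = value.charCodeAt(index + 1); // NaN past the end.
      if (next >= 0xDC00 && next <= 0xDFFF) {
        index++; // Skip the low half of a valid pair.
        continue;
      }
      return true;
    }
    if (unit >= 0xDC00 && unit <= 0xDFFF) {
      // A low surrogate with no preceding high surrogate.
      return true;
    }
  }
  return false;
}

console.log(hasLoneSurrogate('hi \u{1F600}')); // false (valid surrogate pair)
console.log(hasLoneSurrogate('hi \uD83D'));    // true (lone high surrogate)
```

A lone surrogate has no UTF-8 representation, so this is one concrete place where the UTF-16 / UTF-8 mismatch discussed above would show up: encoders such as `TextEncoder` substitute U+FFFD for it rather than preserving the original code unit.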


theengineear force-pushed the unforgiving-html branch 3 times, most recently from 3bcbba0 to 295b4ed Compare

December 11, 2024 17:00

theengineear force-pushed the unforgiving-html branch from 295b4ed to a964a2b Compare

December 16, 2024 18:34

theengineear marked this pull request as draft

December 17, 2024 01:40

theengineear force-pushed the unforgiving-html branch 2 times, most recently from e5c6421 to 6ac7000 Compare

December 18, 2024 21:30

klebba reviewed x-template.js (10 review threads, all now outdated and resolved)

theengineear force-pushed the unforgiving-html branch 4 times, most recently from a3ed7d8 to 4ca7863 Compare

December 20, 2024 00:15

theengineear commented


x-template.js

		}
		// Again, there might be a quote we need to slice off here still.


theengineear Dec 20, 2024


As mentioned — removing that Forgiving comment has made the git diff super wonky here. Let me know if you want that back in 👌


theengineear marked this pull request as ready for review

December 20, 2024 00:20

theengineear requested a review from klebba

December 20, 2024 00:21

theengineear force-pushed the unforgiving-html branch 2 times, most recently from 3bfc3cf to 9874ee3 Compare

December 20, 2024 01:18


theengineear commented Dec 20, 2024

OK @klebba, there are certainly still bugs in here, but I believe this is a reasonable first pass at the new “unforgiving” parser. I’m going to turn my attention to integration-testing it against code we’ve authored to start confirming parsing behavior.


theengineear force-pushed the unforgiving-html branch 3 times, most recently from ae0ea29 to 72c5137 Compare

December 23, 2024 15:48

theengineear force-pushed the unforgiving-html branch 2 times, most recently from 650a871 to 42257d6 Compare

December 27, 2024 23:15


          Introduce “UnforgivingHtml” parser.

72ade95

Goals of the parser:
* Tighten control over things like double-quotes & closing tags.
* Improve error messaging for malformed markup.
* Improve performance.

Closes #239.

theengineear force-pushed the unforgiving-html branch from 42257d6 to 72ade95 Compare

December 28, 2024 01:57


Reviewers

klebba (awaiting requested review)
