Document limitation of parsing body fragments only, rename processFragment method #4

Merged: mpdude merged 3 commits into main from restrict-to-body-fragment on Jan 26, 2026

mpdude (Member) commented on Jan 26, 2026

The HTML5 specification defines different parser states and tokenization modes depending on where content appears in an HTML document. For example, content inside <textarea> or <title> is tokenized as RCDATA, so markup-like text there does not create child elements, whereas the same text inside <body> does. PHP's DOM API for HTML5 parsing does not currently expose a documented way to create fragments with the context information needed to handle all of these parsing states correctly.
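
To illustrate the context dependence described above, here is a minimal sketch. It assumes PHP 8.4 or later with the Dom\HTMLDocument class and is not taken from this PR; the sample markup and variable names are made up for the example.

```php
<?php
// Illustration only: assumes PHP 8.4+ (Dom\HTMLDocument). Not code from this PR.

// The same characters "a <b>b</b>" appear once in <title> and once in <body>.
$html = '<!DOCTYPE html><html><head><title>a <b>b</b></title></head>'
      . '<body><p>a <b>b</b></p></body></html>';

// LIBXML_NOERROR suppresses parser warnings for this small sample.
$doc = Dom\HTMLDocument::createFromString($html, LIBXML_NOERROR);

$title = $doc->querySelector('title');
$p     = $doc->querySelector('p');

// Inside <title>, the tokenizer treats content as RCDATA:
// "<b>" stays literal text, so no child element is created.
$titleHasElementChild = $title->firstElementChild !== null;

// Inside <body>, the same characters are parsed as markup,
// so the paragraph gets a <b> child element.
$paragraphHasElementChild = $p->firstElementChild !== null;

var_dump($titleHasElementChild, $paragraphHasElementChild);
```

A fragment parser that always assumes body context would therefore misread content that originally came from a <title> or <textarea>.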

This PR therefore renames the processFragment() method to processBodyFragment(), making the limitation explicit in the API itself. The fragment parsing implementation also changes from a container-element-based approach to using the body element directly, so that fragments are parsed in the "in body" state. A cautionary notice is added to the README.
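
As a rough sketch of what parsing a fragment with the body element directly could look like, the following assumes PHP 8.4 or later. The function name parseBodyFragment() is hypothetical, and the library's actual implementation may differ.

```php
<?php
// Hypothetical sketch, assuming PHP 8.4+ (Dom\HTMLDocument).
// This is NOT the library's implementation, only one way to use <body> as context.

function parseBodyFragment(string $fragment): Dom\HTMLDocument
{
    // Start from a minimal document so that a <body> element exists.
    $doc = Dom\HTMLDocument::createFromString(
        '<!DOCTYPE html><html><head></head><body></body></html>',
        LIBXML_NOERROR
    );

    // Setting innerHTML on <body> parses the string with <body> as its context,
    // so the fragment is handled under body parsing rules.
    $doc->body->innerHTML = $fragment;

    return $doc;
}

$doc = parseBodyFragment('<p>Hello <em>world</em></p>');

// Serialize the processed fragment back to a string.
echo $doc->body->innerHTML, PHP_EOL;
```

Because the context is fixed to <body>, a sketch like this would still give misleading results for fragments that belong in <head>, which is the limitation the new method name makes visible.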

The previous implementation could produce incorrect results for fragments taken from <head> sections (such as <title> tags), for other contexts where different parsing rules apply, and for fragments that themselves contain <head> and/or <body> tags.

Skipped tests demonstrate the issue, which should make a proper fix easier once PHP's DOM API adds support for fragment context.

For now, explicitly limiting usage to <body> fragments seems the safer choice. It is also easier to change later than potentially quirky detection logic in the implementation that would try to handle the different cases (head and/or body) correctly.

mpdude merged commit d692601 into main on Jan 26, 2026
4 checks passed
mpdude deleted the restrict-to-body-fragment branch on January 26, 2026 at 10:27