Floki is less lenient with nested comments than browsers #612

Open
wmnnd opened this issue Feb 16, 2025 · 1 comment
wmnnd commented Feb 16, 2025

HTML doesn’t allow nested comments. However, both Firefox and Chromium are somewhat lenient about this, which can lead to surprising results when you parse a document with Floki (I tried this with 0.37.0):

raw_html = """
<!doctype html>
<body>
Before the comment<br>

<!--[if mso | IE]>
  <div>
    <!-- this is a nested comment -->
  </div>
<![endif]-->

After the comment.
</body>
"""
parsed_html = raw_html |> Floki.parse_document!() |> Floki.raw_html()

File.write!("raw-html.html", raw_html)
File.write!("parsed-html.html", parsed_html)

raw-html.html looks exactly like the original string:

<!doctype html>
<body>
Before the comment<br>

<!--[if mso | IE]>
  <div>
    <!-- this is a nested comment -->
  </div>
<![endif]-->

After the comment.
</body>

But parsed-html.html looks like this:

<body>
Before the comment<br/><!--[if mso | IE]>
  <div>
    <!-- this is a nested comment -->
&lt;![endif]--&gt;

After the comment.
</body>

Floki drops the closing </div> and escapes the trailing <![endif]--> of the outer comment to &lt;![endif]--&gt;. Because browsers are lenient when handling nested comments, this changes the way the file is displayed:

[Images: raw-html.html and parsed-html.html as rendered in the browser]
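
For illustration, here is a minimal sketch of one likely reason the output looks like this. It assumes the parser ends a comment at the first `-->` it encounters, which is how the HTML parsing rules treat comments. The module and function names (`CommentSketch`, `split_first_comment/1`) are hypothetical and not part of Floki. The sketch uses only the standard library and does not reproduce Floki's actual tokenizer.

```elixir
defmodule CommentSketch do
  # Hypothetical helper, not part of Floki.
  # Takes input that starts with "<!--" and splits it at the first "-->",
  # returning {comment_body, rest}. Returns :unterminated if no "-->" exists.
  def split_first_comment("<!--" <> after_open) do
    case String.split(after_open, "-->", parts: 2) do
      [body, rest] -> {body, rest}
      [_no_terminator] -> :unterminated
    end
  end
end

html = """
<!--[if mso | IE]>
  <div>
    <!-- this is a nested comment -->
  </div>
<![endif]-->
"""

{body, rest} = CommentSketch.split_first_comment(html)

# body ends with the text of the nested comment: the inner "-->" closed the
# outer comment, so the inner "<!--" is just part of the comment data.
IO.inspect(body, label: "comment body")

# rest still contains "</div>" and "<![endif]-->", which are now ordinary
# markup outside of any comment.
IO.inspect(rest, label: "left over")
```

Under that assumption, everything after the inner `-->` is outside the comment, so how the leftover `</div>` and `<![endif]-->` are handled is up to each parser.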

I’m not sure whether this counts as a bug, but I did find it somewhat unexpected.

wmnnd commented Feb 16, 2025

It looks like browsers other than IE behave like this specifically for IE conditional comments.

A regular non-conditional comment is rendered like this:

<!doctype html>
<body>
Before the comment<br>

<!-- this is just a random comment
  <div>
    <!-- this is a nested comment -->
  </div>
this is the end of the random comment -->

After the comment.
</body>

[Image: browser rendering of the document above]
