parseHTML() unsuitable for XML/XHTML with self closing tags: textContent() returns next siblings' textContent
#2644
Comments
Thanks for opening this issue, though I am not sure this is a bug, exactly 🤔 It seems pretty clear from the code that this is actually the intended behavior:

k6/js/modules/k6/html/element.go, lines 222 to 224 in 5b70e70
k6/vendor/github.com/PuerkitoBio/goquery/property.go, lines 60 to 83 in 5b70e70

And it seems that it roughly matches the Web API it is attempting to mimic, which also returns the text of the descendants: https://developer.mozilla.org/en-US/docs/Web/API/Node/textContent

So, yeah, this seems like it's working as intended. What is your use case and why is this behavior a problem for you?
h1 and p are not descendants of the div. As already mentioned, the div is self-closed, so it does not have any descendants.
Weird, so according to the Web API the browser also wraps them inside...
OK, sorry for the disturbance. The issue should be closed then.
Ah, sorry, I apparently missed the "self closing" part of the title 🤦‍♂️ ☕ I guess this might also be reason enough to add an explicit XML-parsing API in k6 as well 🤔 In the past, we've encouraged people to abuse
I'll reopen this and adjust the title a bit. While k6 might follow browser conventions, I think this is something worth fixing, probably by having a proper XML parsing API, which might also be needed for other purposes, e.g. #1539. Or, at the very least, we should document this limitation of FWIW,
As pointed out in the previous comment, this is a specific limit of the Based on the lack of reactions, we don't plan to address this specific issue in k6. In the future, we might add a better HTML implementation if there is a need, probably as an extension rather than as a native module in k6 core. These days the browser module exists, and it might be an alternative for some of the cases originally covered by the HTML module.
Brief summary
see steps to reproduce
k6 version
0.39.0
OS
Windows 10
Docker version and image (if applicable)
No response
Steps to reproduce the problem
Expected behaviour
ERRO[0000] {"value":"123","text":""} source=console
Actual behaviour
ERRO[0000] {"value":"123","text":"My HeadingMy paragraph."} source=console
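One possible workaround, sketched here in plain JavaScript, is to expand XML-style self-closing tags before handing the markup to an HTML parser. HTML5-style parsers ignore the trailing `/` on non-void HTML elements such as `div`, so `<div/>` is treated as an opening tag and the elements that follow end up as its children. The helper name `expandSelfClosingTags`, the `VOID_ELEMENTS` list, and the sample markup are assumptions made for illustration; they are not part of k6 or of the original report. The regular expression is deliberately simple and does not handle attribute values containing `>`, comments, or CDATA sections.

```javascript
// Hypothetical helper, not part of k6: rewrites <name attrs/> as
// <name attrs></name> for non-void elements, so an HTML parser sees
// an explicit closing tag instead of ignoring the trailing slash.
const VOID_ELEMENTS = new Set([
  'area', 'base', 'br', 'col', 'embed', 'hr', 'img', 'input',
  'link', 'meta', 'source', 'track', 'wbr',
]);

function expandSelfClosingTags(markup) {
  // Group 1 is the tag name, group 2 the raw attribute text.
  // Limitation: attribute values containing '<' or '>' are not supported.
  return markup.replace(
    /<([A-Za-z][\w:-]*)([^<>]*?)\s*\/>/g,
    (match, name, attrs) => {
      // Void elements never take a closing tag, so leave them untouched.
      if (VOID_ELEMENTS.has(name.toLowerCase())) {
        return match;
      }
      return `<${name}${attrs}></${name}>`;
    }
  );
}

// Sample markup invented for this sketch.
const sample = '<div id="x" value="123"/><h1>My Heading</h1><p>My paragraph.</p>';
console.log(expandSelfClosingTags(sample));
// prints: <div id="x" value="123"></div><h1>My Heading</h1><p>My paragraph.</p>
```

In a k6 script the expanded string could then be passed to `parseHTML()` from `k6/html`. With an explicit `</div>` the heading and paragraph should parse as siblings of the `div` rather than as its children, though this has not been verified against a specific k6 version.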