fix: update branch offsets before evaluating compiled function frame size #1699
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Closes #1698
cc @tjjfvi Thanks again for your detailed bug report with which I was able to isolate the issue.
After finding the bug it was clear to me why fuzzing couldn't find it and why it is fixed on the new branch.
Why didn't fuzzing find it?
How I found it: I scanned through the meta-information of compiled functions of your
repro.wasmfile and found a single HUGE function that had more function local constants than its frame size permits.Fuzzing couldn't find the issue since it only happens for extremely large Wasm functions (>2^15 instructions) where a specific fallback encoding for conditional branch instructions is required. However, fuzzing will never generate Wasm files as huge as needed for this to be found.
What caused the bug?
The bug was that the frame size of a compiled function was computed prior to resolving all the branch offsets. However, for very large functions updating the branch offsets might allocate new function local constants which affect frame sizes.
The new branch removes the fallback encoding for conditional branches and it also removes function local constants - so kinda a double fix.
Usually, I'd add a regression test, however, given that a test testing this would be rather slow and given that this bug cannot occur in this form due to the change in interpreter architecture, I think it is okay to just have the elaborate comment directly in the code.