-
Notifications
You must be signed in to change notification settings - Fork 483
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Incorrect positions might be recorded after ORC-1711 #2037
Comments
Thanks for the report, I'll take a look. |
Merged
wgtmac
added a commit
that referenced
this issue
Oct 11, 2024
### What changes were proposed in this pull request? Make sure the recorded position in blockCompressStream is correct. ### Why are the changes needed? Fix the issue discussed [here](#2037). ### How was this patch tested? UTs in testwriter.cc can cover this patch. ### Was this patch authored or co-authored using generative AI tooling? No. Closes #2038 from luffy-zh/ORC-1711. Lead-authored-by: luffy-zh <zhnice@outlook.com> Co-authored-by: Gang Wu <ustcwg@gmail.com> Signed-off-by: Gang Wu <ustcwg@gmail.com>
i checked on our tests and everything looks fine now - big thanks @luffy-zh |
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Hi 👋🏼
It seems we stepped on a bug after updating ORC submodule on 360c061.
In the corresponding PR we start seeing test failure.
The failing test:
select count(), sum(number) from file('02892.orc') where indexHint(sometimes_null is NULL);
.The exception:
orc::ParseError, e.what() = Bad seek to (chunkHeader=0, posInChunk=65536) in zstd(ORCInputStream from 1401 for 12)
While looking at this test I noticed that we indeed read wrong positions here:
orc/c++/src/Reader.cc
Lines 457 to 462 in f919fab
And then I found that we actually also write incorrect values here:
orc/c++/src/ByteRLE.cc
Lines 190 to 194 in f919fab
And then I found that the difference between previously written values and the values we're getting after the update is only that in the positions when we previously had zeroes we now have 65536. And this number is equal to MemoryBlockSize in the PR. It happens when all three
unusedBufferSize, bufferLength, bufferPosition
were equal to 0.I'm not familiar with the code, so all I did is just implemented a small crutch to work around that issue: ClickHouse@e77893c and at least it fixed our tests.
Now I'd like to ask the actual experts in this code to have a look and implement a proper fix.
Thanks!
The text was updated successfully, but these errors were encountered: