Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Incorrect positions might be recorded after ORC-1711 #2037

Closed
nickitat opened this issue Sep 23, 2024 · 3 comments
Closed

Incorrect positions might be recorded after ORC-1711 #2037

nickitat opened this issue Sep 23, 2024 · 3 comments

Comments

@nickitat
Copy link

nickitat commented Sep 23, 2024

Hi 👋🏼
It seems we stepped on a bug after updating ORC submodule on 360c061.

In the corresponding PR we start seeing test failure.
The failing test: select count(), sum(number) from file('02892.orc') where indexHint(sometimes_null is NULL);.
The exception: orc::ParseError, e.what() = Bad seek to (chunkHeader=0, posInChunk=65536) in zstd(ORCInputStream from 1401 for 12)

While looking at this test I noticed that we indeed read wrong positions here:

orc/c++/src/Reader.cc

Lines 457 to 462 in f919fab

if (pbStream.kind() == proto::Stream_Kind_ROW_INDEX) {
proto::RowIndex rowIndex;
if (!rowIndex.ParseFromZeroCopyStream(inStream.get())) {
throw ParseError("Failed to parse the row index");
}
rowIndexes_[colId] = rowIndex;

And then I found that we actually also write incorrect values here:

orc/c++/src/ByteRLE.cc

Lines 190 to 194 in f919fab

if (outputStream->isCompressed()) {
// start of the compression chunk in the stream
recorder->add(flushedSize);
// number of decompressed bytes that need to be consumed
recorder->add(unflushedSize);

And then I found that the difference between previously written values and the values we're getting after the update is only that in the positions when we previously had zeroes we now have 65536. And this number is equal to MemoryBlockSize in the PR. It happens when all three unusedBufferSize, bufferLength, bufferPosition were equal to 0.

I'm not familiar with the code, so all I did is just implemented a small crutch to work around that issue: ClickHouse@e77893c and at least it fixed our tests.
Now I'd like to ask the actual experts in this code to have a look and implement a proper fix.

Thanks!

@nickitat
Copy link
Author

cc @luffy-zh @ffacs @wgtmac

@luffy-zh
Copy link
Contributor

Thanks for the report, I'll take a look.

wgtmac added a commit that referenced this issue Oct 11, 2024
### What changes were proposed in this pull request?
Make sure the recorded position in blockCompressStream is correct.

### Why are the changes needed?
Fix the issue discussed [here](#2037).

### How was this patch tested?
UTs in testwriter.cc can cover this patch.

### Was this patch authored or co-authored using generative AI tooling?
No.

Closes #2038 from luffy-zh/ORC-1711.

Lead-authored-by: luffy-zh <zhnice@outlook.com>
Co-authored-by: Gang Wu <ustcwg@gmail.com>
Signed-off-by: Gang Wu <ustcwg@gmail.com>
@nickitat
Copy link
Author

i checked on our tests and everything looks fine now - big thanks @luffy-zh

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants