-
Notifications
You must be signed in to change notification settings - Fork 162
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Regression between old and new version for loading parquet #15
Comments
Aha! Yes please do send me a sample. I thought that code path wasn't possible. |
Hey @Shogan ping on a sample to help me reproduce this :/ |
What I'm going to do in the meantime is drop this panic. It's demonstrating a real bug but maybe you don't care about this particular column. Instead it will just log some info about the column and you still won't be able to query that column until I fix the bug. multiprocessio/datastation#162 this pr is where the main fix happens. |
Just started updating 0.6.0 for the AUR, and I'm getting a potential regression that may be related to this. For reference, the version I am updating from (0.5.0) passes all tests successfully. Here's test output from
|
Rats! No I don't think it's related. I redid the way JSON encoding/decoding works so it's not surprising there's a bug. But it is surprising it's in of the files that are tested in automated testing! |
I'm having trouble reproducing this though. In Github Actions this test passes, as does it on my MBP and Fedora Linux dev machine. I also tried building dsq and running the tests in an Can you tell me any more about your machine/environment? I'm surprised it worked for you before and now breaks. |
Hi @eatonphil , sorry it took me so long - I was on vacation for a while and not checking notifications. I've had a look at the parquet I was querying and unfortunately I can't provide it here easily as it has sensitive data. However in trying to load it up and edit it to remove said data, using viewer tool I got an exception about INT96 being unsupported. So this parquet data I'm working with uses INT96. Could that be an issue? I believe it is incompatible with avro. I used the same version of dsq that I used when I started this thread to load some other parquet file examples I found here and it worked fine for these. |
Ok I'll try making a dataset with INT96 in it and see if that causes an issue. But also if you don't need that column right now newer versions of dsq won't crash when this happens. They'll just not be able to load that column for querying. |
Nice! Confirmed version 0.6.0 works. Thanks @eatonphil 🎉 |
I can reproduce it on two different machines both running Arch Linux, both Intel & AMD CPUs with go 1.18. Basically just running in a clean chroot (based on systemd-nspawn) following our Go packaging guidelines, so the flags could be a thing (but I doubt it). I've currently skipped the failing parquet tests. |
Gotcha! It's the -buildmode=pie flag that is exposing this crash (I don't know whether or not to say it's causing the crash). I'm going to make a separate issue about -buildmode=pie. I don't know whether I'll be able to fix it though. |
Previously with an older release of dsq I could do a basic SQL select on a parquet file.
With the latest release (
0.2.0
), I get this error:panic: Missing type equality condition for unknown merge.
Command:
If you can't reproduce this easily I can see if I can get a sample parquet file together and attached to this.
Stacktrace:
The text was updated successfully, but these errors were encountered: