-
-
Notifications
You must be signed in to change notification settings - Fork 13
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
test: Support for Multi-Level Partition Tables #115
test: Support for Multi-Level Partition Tables #115
Conversation
Signed-off-by: shamb0 <r.raajey@gmail.com>
Signed-off-by: shamb0 <r.raajey@gmail.com>
a1ae9c3
to
4d664ed
Compare
Hi @rebasedming, I'd like to provide an update on the progress of the investigation:
Could you please provide feedback on the approach used for handling multi-level partition table queries in the execution hook? Additionally, I would appreciate any suggestions on how it can be further refined. |
Hi @shamb0 - I admire your perseverance chasing down this issue! It's a tricky one. I know this is just a draft, but I don't feel good about this implementation. Intercepting and rewriting the query feels pretty unsafe and introduces a lot of complexity to the code base. I'm open to being convinced but I think this is way too much technical overhead for just this one feature. |
Hi @rebasedming, Thank you for your thorough review and candid feedback on the PR code patch. I truly appreciate the time you've taken and your openness to further discussion. Regarding your concerns:
I'm looking forward to our continued collaboration on this. |
Hi @rebasedming,
Thank you for your guidance throughout this process. I appreciate your support.
|
* fixed date functions support * fixing lint & more testing * fix & remove unecessary conversion based on PR comments * date trunc test * remove unnecessary code --------- Co-authored-by: Evance Soumaoro <evanxg852000@gmail.com> Co-authored-by: Ming Ying <ming.ying.nyc@gmail.com>
Signed-off-by: shamb0 <r.raajey@gmail.com>
Signed-off-by: shamb0 <r.raajey@gmail.com>
bdf1997
to
0b830ba
Compare
Signed-off-by: shamb0 <r.raajey@gmail.com>
* fixed json & jsonb cast support * fixing test * fixing test * adding a better test * refactoring tests * bug fixes * json tests passing * remove debug * Increase the runner size Signed-off-by: Philippe Noël <21990816+philippemnoel@users.noreply.github.com> --------- Signed-off-by: Philippe Noël <21990816+philippemnoel@users.noreply.github.com> Co-authored-by: Vipul Vaibhaw <vaibhaw.vipul@gmail.com> Co-authored-by: Ming Ying <ming.ying.nyc@gmail.com> Co-authored-by: Philippe Noël <21990816+philippemnoel@users.noreply.github.com>
* Rm cargo clean * Remove unnecessary test Signed-off-by: Philippe Noël <21990816+philippemnoel@users.noreply.github.com> --------- Signed-off-by: Philippe Noël <21990816+philippemnoel@users.noreply.github.com>
…vel-partition-table-dset-auto-sales
Signed-off-by: shamb0 <r.raajey@gmail.com>
Hi @philippemnoel, I have resolved the code conflicts, and the branch should now be ready for intake review and potential merge. :) |
Thank you! I’m personally very excited for this iteration, it feels much better. We’re pretty busy with a few big features right now, but we’ll review ASAP and I’m hopeful we can get this merged next week. @rebasedming is the final authority on any merge, so whenever he has some time he’ll take a look! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM! Thanks @shamb0 👍 I left some minors for your consideration.
…vel-partition-table-dset-auto-sales
@Weijun-H, thanks for your feedback! I’ve addressed all your comments. Let me know if there’s anything else to improve. |
Signed-off-by: shamb0 <r.raajey@gmail.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🚀
@@ -0,0 +1,98 @@ | |||
// Copyright (c) 2023-2024 Retake, Inc. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we name test theses a bit better? Perhaps something like test_hive_partitioning
or something like that?
@@ -0,0 +1,614 @@ | |||
// Copyright (c) 2023-2024 Retake, Inc. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This file is huge. Do we really need an entire new fixture or can we reuse our existing ones with nyc_trips
? If not, is there any way to shorten this file?
Overall I'm quite happy with this PR, but I just find it to introduce a lot of new testing utilities. Some of them are also built in this file, but don't seem to be actual tables. These utilities should probably be moved to a separate file in tests/fixtures/
. Perhaps something like parquet.rs
for the batch writing, or s3.rs
or something like that.
@Weijun-H Would love your help thoroughly reviewing this PR. Ming is busy with search-related work.
Once we clean up this file, I am happy to approve and merge this. Thank you for your patience @shamb0.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tl;dr:
- If we really need a full separate table, let's keep only that in this file and move the utilities to a properly scoped. file
- If we don't need a full separate table, let's use the NYC trips test fixture
Thank you!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Requested the above change^ Once that's done/answered, I am willing to approve this so long as @Weijun-H is also onboard.
Hi @philippemnoel, Thank you for the review comments, I really appreciate it. Moving forward, I'll aim for smaller PRs with minimal changes to make them easier to review. |
Closes #56
What
Implements a demonstration test for multi-level partition tables, addressing issue #56.
Why
This demonstration showcases the
pg_analytics
extension's ability to support multi-level partitioned tables. By using Hive-style partitioning for organizing data hierarchically, it enables efficient access to context-specific information, enhancing query performance and scalability.How
The implementation involves two key components:
Hive-style Partitioning in S3
FOREIGN TABLE Configuration in
pg_analytics
FOREIGN TABLE
with options to access the Hive-style partitioned dataset:Tests
To run the demonstration test, use the following command:
RUST_LOG=info \ cargo test \ --test \ test_mlp_auto_sales \ -- \ --nocapture