This is my weekly plan, mostly for my own organizational needs (as I am dropping too many things). I am making it public in the hope that it helps others see what I am working on -- also, I spend so much time in GitHub that the interface is very familiar to me and I can cross-link all the issues I am working on.
(it is also my excuse as to why I haven't reviewed many good looking PRs)
Note to myself: an unchecked duplicate entry means I need to go back and re-review
PR review queue (rough order)
- feat: support invoking table functions with tables rather than a single expression #18535
- fix: pre-warm listing file statistics cache during listing table creation #18971
- Support simplify not for physical expr #18970
- Row group limit pruning #18868 (rereview)
- feat: integrate batch coalescer with repartition exec #19002
- Cut Parquet over to PhysicalExprAdapter from SchemaAdapter #18998 (projection pushdown)
- common: Add hashing support for REE arrays #18981
- Support reverse parquet scan and fast parquet order inversion at row group level #18817
- Adds memory-bound DefaultListFilesCache #18855
- fix: try merging dictionary as a fallback on overflow error arrow-rs#8652
- common: Add hashing support for REE arrays #18981
- add specialized InList implementations for common scalar types #18832
- Make Parquet SBBF serialize/deserialize helpers public for external reuse arrow-rs#8762
- Emit aggregation groups in chunks to avoid blocking async runtime #18906
- Add support for `Union` types in `RowConverter` arrow-rs#8839
- Add union to opaque comparisons arrow-rs#8896
- Add validated constructors for UnionFields arrow-rs#8891
- feat(memory-tracking): expose API to NullBuffer, ArrayData, and Array arrow-rs#8918 (comment)
- [DRAFT] Extension Type Registry Draft #18552
- feat: implement GroupArrayAggAccumulator attempt 3 #17915
- Add relation planner extension support #17843
- Enable Parallel Aggregation for Non-Overlapping Partitioned Data #18826
Projects I am supporting actively (high on my priority list)
- Improve DataFusion ClickBench performance: [EPIC] Make DataFusion the top of the ClickBench Parquet leaderboard #18489
- Release object store 0.13.0: Release object store `0.13.0` (breaking) - Target Nov 2025 arrow-rs-object-store#367
- DataFusion object store requests go faster with @BlakeOrth [EPIC] ListingTable object store usage improvements #17214
- Help integrate Variant with @friendlymatthew [EPIC] Support `VARIANT` type for unstructured data #16116
Projects on my backlog
These are projects I would like to support but don't currently have the capacity to push forward, in relative order
PRs that look great but need a thorough review (looking for help here 🎣 from anyone else)
- Examples of extending SQL syntax #17824 from @theirix
- external tables for multiple locations: feat(cli): support external tables on multiple locations #17702
- relation extension planner: Add relation planner extension support #17843
- writing REE arrays to parquet: Support writing RunEndEncoded as Parquet arrow-rs#8069