This repository was archived by the owner on Feb 18, 2024. It is now read-only.
How to go from original data in Vec<T> to arrow (and parquet)? #1551
Unanswered
nbigaouette asked this question in Q&A
Replies: 1 comment
Hi, I have the same question, particularly about |
I'm trying to wrap my head around Parquet and Arrow. What I'd like to have is a way to serialize Rust data to a binary on-disk format. I thought Parquet could be this binary format, but I'm having trouble doing this serialization properly.
My Rust data can be described as a group of two (or more) vectors. For example:
Most of the time, I have a single values vector, so I can simplify to:
And since memory is managed elsewhere, the following would be ideal to serialize without copy:
Now, everywhere I look, it seems that the serialization expects something like a "row" instead of a "column". This means I need to create a struct like this:
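To make the cost of the row-oriented route concrete, here is a minimal sketch using only the standard library. The `Row` type, its field names and element types, and the `columns_to_rows` helper are all hypothetical (they are not the code from the original post). It only shows that building rows from two existing columns means allocating a new vector and copying every element:

```rust
// Hypothetical row type: the field names and types are illustrative only.
#[derive(Debug, Clone, PartialEq)]
struct Row {
    key: u64,
    value: f64,
}

/// Builds one `Row` per index by copying from two equally long columns.
/// This allocates a new `Vec<Row>`, which is the copy the question wants to avoid.
fn columns_to_rows(keys: &[u64], values: &[f64]) -> Vec<Row> {
    assert_eq!(keys.len(), values.len(), "columns must have the same length");
    keys.iter()
        .zip(values.iter())
        .map(|(&key, &value)| Row { key, value })
        .collect()
}

fn main() {
    let keys = vec![1_u64, 2, 3];
    let values = vec![0.5_f64, 1.5, 2.5];
    let rows = columns_to_rows(&keys, &values);
    println!("{:?}", rows);
}
```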
Since I already have my Vec in memory, do I really need to "chunk" some rows together before saving to Parquet? Can't I just point to the already allocated memory and say "here are two slices, save them to disk"?
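The "here are two slices, save them to disk" idea can be sketched with the standard library alone. Everything in this sketch is an assumption made for illustration: the `Columns` view, the `write_columns` helper, the element types, and the ad hoc byte layout. It is not Parquet and not an Arrow API. It only shows that columns borrowed as slices can be written one after another without first being regrouped into rows:

```rust
use std::io::{self, Write};

/// Borrowed view over two equally long columns; building it copies nothing.
/// (Hypothetical type, not part of any Arrow or Parquet crate.)
struct Columns<'a> {
    keys: &'a [u64],
    values: &'a [f64],
}

/// Writes the row count, then each column contiguously, in little-endian.
/// This is an ad hoc layout for illustration, NOT the Parquet format.
fn write_columns<W: Write>(out: &mut W, cols: &Columns<'_>) -> io::Result<()> {
    if cols.keys.len() != cols.values.len() {
        return Err(io::Error::new(
            io::ErrorKind::InvalidInput,
            "column lengths differ",
        ));
    }
    out.write_all(&(cols.keys.len() as u64).to_le_bytes())?;
    for k in cols.keys {
        out.write_all(&k.to_le_bytes())?;
    }
    for v in cols.values {
        out.write_all(&v.to_le_bytes())?;
    }
    Ok(())
}

fn main() -> io::Result<()> {
    let keys = vec![1_u64, 2, 3];
    let values = vec![0.5_f64, 1.5, 2.5];
    let cols = Columns { keys: &keys, values: &values };
    let mut buf = Vec::new(); // stands in for a file on disk
    write_columns(&mut buf, &cols)?;
    println!("wrote {} bytes", buf.len());
    Ok(())
}
```

Parquet itself stores data column by column on disk, so a column-wise input like this matches its layout. A real writer additionally handles schema, encoding, and compression, which this sketch leaves out.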