-
Notifications
You must be signed in to change notification settings - Fork 1.1k
Expose memory tracking API to Array and ArrayData #8040
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
9ef341f to
4fa363c
Compare
Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
31b8c72 to
27f8c46
Compare
|
Thanks @waynexia -- is there a high level description / "[EPIC]" style ticket of what we are doing with these APIs? I think it would be good to make an issue that describes the high level project / direction for wider visibility (and likely to get other people involved) I found some related issues. Do any of them reflect your plan? |
| use arrow_schema::{DataType, Field}; | ||
| use std::sync::Arc; | ||
|
|
||
| fn main() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we write this as a (doc?)test instead (or as well)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, please -- I think it would be much easier to find as an doc test -- perhaps you could just move it to Array::claim
Sounds like a great idea -- thank you @waynexia and @Dandandan I am quite backed up reviewing other projects and PRs , but will try and review this one over the next few days. Any help reviewing would be most apprecaited. I filed a epic ticket to track this project, and maybe we can use that to help organize our work a bit better |
alamb
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you @waynexia -- I think the code makes a lot of sense and we could merge it as is.
However, I feel strongly that without some additional documentation on how to use claim downstream projects are not likely to be able to take advantage of this new API
It do think it would be ok to update the docs as a follow on PR, especially if you wanted to get this PR into the next release (I plan to make a RC in the next few days)
| /// let pool = TrackingMemoryPool::default(); | ||
| /// | ||
| /// // Claim the array's memory in the pool | ||
| /// array.claim(&pool); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you also add an example (either here or elsewhere) of how one would use claim?
For example, if we now did
let array2 = array1.slice(0, 1);Is the idea that now array2.array_memory_size() would be zero?
| slice2.claim(&pool); | ||
| let final_usage = pool.used(); | ||
|
|
||
| println!("After claiming 2 slices: {final_usage} bytes"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think these should actually test the bytes used (not just print them out)
| use arrow_schema::{DataType, Field}; | ||
| use std::sync::Arc; | ||
|
|
||
| fn main() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, please -- I think it would be much easier to find as an doc test -- perhaps you could just move it to Array::claim
|
@waynexia I am preparing for another arrow release, hopefully I'll make the RC tomorrow. Would you like to
Just let me know |
|
Is it being currently worked upon? I can help in moving this PR forward @waynexia @Dandandan @alamb |
New `claim` API for NullBuffer, ArrayData, and Array. New `pool` feature-flag to arrow, arrow-array and arrow-data. Part of apache#8137. Replaces apache#8040.
|
👋 I've reheated this PR over at feat(memory-tracking): expose API to NullBuffer, ArrayData, and Array. Let me know what you think! :) |
Which issue does this PR close?
We generally require a GitHub issue to be filed for all bug fixes and enhancements and this helps us generate change logs for our releases. You can link an issue to this PR using the GitHub syntax.
Rationale for this change
#7303 implements the fundamental symbols for tracking memory. This patch exposes those APIs to a higher level
ArrayandArrayData.What changes are included in this PR?
New APIs
claimforArrayandArrayData. New featurepooltoarrow,arrow-arrayandarrow-datafor the new APIAre these changes tested?
Yes, and there is an example to demo basic usage.
Are there any user-facing changes?
New API and feature