-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[io] Extract tkey walk logic from TFile::Map() #17575
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like the idea! In this approach, all keys are loaded in memory before printing the information. Do we need a piecewise / cursor based API?
This is possible, although a file containing 1 million keys would only occupy 68 MiB of memory for the TKeyMapNodes (for reference, the 3.8 GB ttjet_13tev benchmark dataset has 278470 keys - about 38 MiB of memory). This is not counting the classname/keyname/key title strings; with them the figure is likely doubled or so. |
0491d72
to
45adcda
Compare
The current observed maximum number of baskets in TTree is 50 millions baskets ... and only because it reaches the 1Gb limit for the TTree object. It will/can grow larger once we lift the 1Gb limit and can already reach larger size with RNTuple (probably not quite as easily due to page size being larger than basket sizes). Nonetheless that is 3.1 GiB of memory for the TKeyMapNodes .... so indeed I would recommend some sort of iterators mechanism (other-wise the code simply 'crash/out-of-memory' for large files. |
Test Results 18 files 18 suites 4d 7h 5m 39s ⏱️ Results for commit 2bcccda. ♻️ This comment has been updated with latest results. |
ef10a54
to
cde8d52
Compare
cde8d52
to
2bcccda
Compare
@pcanal is there anything else blocking this PR? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Thanks.
This Pull request:
refactors
TFile::Map
into 2 methods:Map
andWalkTKeys
. The latter contains the logic of traversing the TKeys in the file and returns an array with information about keys, gaps and errors.Map
now simply calls that method and prints out the relevant information, in the same format as before.The main advantage of splitting
WalkTKeys
is that it can be used by other places (like unit tests or client code) that are interested in the internal TKey structure.Checklist: