[io] Extract tkey walk logic from TFile::Map() #17575

silverweed · 2025-01-30T11:37:02Z

This pull request refactors TFile::Map into two methods: Map and WalkTKeys. WalkTKeys contains the logic for traversing the TKeys in the file and returns an array with information about keys, gaps and errors. Map now simply calls WalkTKeys and prints the relevant information in the same format as before.

The main advantage of splitting out WalkTKeys is that it can be reused by other code (such as unit tests or client code) that is interested in the internal TKey structure.
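The split described above can be illustrated with a minimal, self-contained sketch. It does not use ROOT: `KeyMapNode`, `RawRecord`, `WalkKeys` and `FormatMap` are hypothetical names invented for this example, the input is an in-memory list rather than a file, and the printed format is not the one TFile::Map produces. The only convention assumed is that a negative byte count marks a free gap and a zero byte count marks an unreadable record.

```cpp
#include <cassert>
#include <cstdint>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical record describing one entry found during the walk.
struct KeyMapNode {
   enum class Type { Key, Gap, Error };
   Type type;
   std::uint64_t addr;    // byte offset of the record
   std::uint32_t len;     // size in bytes (0 for an error)
   std::string className; // only meaningful for keys
};

// Simplified stand-in for an on-disk record header (assumption of this sketch).
struct RawRecord {
   std::int32_t nbytes; // < 0: gap, == 0: unreadable, > 0: key
   std::string className;
};

// The "walk" half: traverse the records and collect structured information.
std::vector<KeyMapNode> WalkKeys(const std::vector<RawRecord> &records, std::uint64_t begin)
{
   std::vector<KeyMapNode> nodes;
   std::uint64_t addr = begin;
   for (const auto &rec : records) {
      if (rec.nbytes == 0) {
         // Without a length we cannot find the next record, so stop here.
         nodes.push_back({KeyMapNode::Type::Error, addr, 0, ""});
         break;
      }
      const bool isGap = rec.nbytes < 0;
      const auto len =
         static_cast<std::uint32_t>(isGap ? -static_cast<std::int64_t>(rec.nbytes) : rec.nbytes);
      nodes.push_back({isGap ? KeyMapNode::Type::Gap : KeyMapNode::Type::Key, addr, len,
                       isGap ? std::string{} : rec.className});
      addr += len;
   }
   return nodes;
}

// The "print" half: it only formats what the walk produced.
std::string FormatMap(const std::vector<KeyMapNode> &nodes)
{
   std::ostringstream out;
   for (const auto &node : nodes) {
      // Invariant of this sketch: error nodes carry no length.
      assert(node.type != KeyMapNode::Type::Error || node.len == 0);
      out << "At:" << node.addr << " N=" << node.len << ' ';
      switch (node.type) {
      case KeyMapNode::Type::Key: out << node.className; break;
      case KeyMapNode::Type::Gap: out << "=====G A P====="; break;
      case KeyMapNode::Type::Error: out << "=====E R R O R====="; break;
      }
      out << '\n';
   }
   return out.str();
}
```

With this shape, a unit test can assert directly on the nodes returned by `WalkKeys` instead of parsing printed output.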

Checklist:

tested changes locally
updated the docs (if necessary)

jblomer

I like the idea! In this approach, all keys are loaded in memory before printing the information. Do we need a piecewise / cursor based API?

io/io/inc/TFile.h

silverweed · 2025-01-30T12:38:54Z

> In this approach, all keys are loaded in memory before printing the information. Do we need a piecewise / cursor based API?

This is possible, although a file containing 1 million keys would only occupy 68 MiB of memory for the TKeyMapNodes (for reference, the 3.8 GB ttjet_13tev benchmark dataset has 278470 keys, which is roughly 19 MiB of memory at the same per-key rate). This does not count the class name, key name and key title strings; with them the figure is likely about double.
Generating the array also seems quite fast on my machine (under 1 s in a debug build).

pcanal · 2025-01-30T15:11:59Z

The largest number of baskets currently observed in a TTree is 50 million, and only because it reaches the 1 GB limit for the TTree object. It can grow larger once we lift the 1 GB limit, and it can already reach larger sizes with RNTuple (probably not quite as easily, since pages are larger than baskets).

Nonetheless, that is 3.1 GiB of memory for the TKeyMapNodes, so I would indeed recommend some sort of iterator mechanism (otherwise the code simply crashes or runs out of memory for large files).
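A cursor-style walk of the kind suggested here could look roughly like the sketch below. It is an illustration only, not the API that was merged: `Node`, `KeyWalk` and `TotalGapBytes` are hypothetical names, the "file" is an in-memory vector of byte counts (negative meaning a gap), and zero-length error records are left out. The point is that each node is decoded when the iterator is dereferenced, so memory use stays constant regardless of the number of keys.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical per-entry view produced on demand by the iterator.
struct Node {
   std::uint64_t addr;
   std::uint32_t len;
   bool isGap;
};

// Iterable over a list of byte counts; holds no per-node storage.
class KeyWalk {
public:
   KeyWalk(const std::vector<std::int32_t> &nbytes, std::uint64_t begin) : fNbytes(&nbytes), fBegin(begin) {}

   class Iterator {
   public:
      Iterator(const std::vector<std::int32_t> *nbytes, std::size_t idx, std::uint64_t addr)
         : fNbytes(nbytes), fIdx(idx), fAddr(addr) {}

      // Decode the current record only when asked for it.
      Node operator*() const
      {
         assert(fIdx < fNbytes->size());
         const std::int64_t n = (*fNbytes)[fIdx];
         return {fAddr, static_cast<std::uint32_t>(n < 0 ? -n : n), n < 0};
      }
      // Advance the cursor past the current record.
      Iterator &operator++()
      {
         fAddr += (**this).len;
         ++fIdx;
         return *this;
      }
      bool operator!=(const Iterator &other) const { return fIdx != other.fIdx; }

   private:
      const std::vector<std::int32_t> *fNbytes;
      std::size_t fIdx;
      std::uint64_t fAddr;
   };

   Iterator begin() const { return Iterator(fNbytes, 0, fBegin); }
   Iterator end() const { return Iterator(fNbytes, fNbytes->size(), 0); }

private:
   const std::vector<std::int32_t> *fNbytes; // not owned; must outlive the walk
   std::uint64_t fBegin;
};

// Example consumer: sums gap sizes without materializing all nodes.
std::uint64_t TotalGapBytes(const KeyWalk &walk)
{
   std::uint64_t total = 0;
   for (const Node node : walk) {
      if (node.isGap)
         total += node.len;
   }
   return total;
}
```

Because `KeyWalk` stores a pointer to the vector, the vector must outlive the walk; constructing it from a temporary would dangle.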

github-actions · 2025-01-30T18:16:33Z

Test Results

18 files, 18 suites, total time 4d 7h 5m 39s
2 687 tests: 2 687 passed, 0 skipped, 0 failed
46 670 runs: 46 670 passed, 0 skipped, 0 failed

Results for commit 2bcccda.

♻️ This comment has been updated with latest results.

silverweed · 2025-02-04T14:02:28Z

@pcanal is there anything else blocking this PR?

io/io/inc/TFile.h

pcanal

LGTM. Thanks.

silverweed added the in:I/O label Jan 30, 2025

silverweed requested review from jblomer, hahnjo, dpiparo, vepadulano and enirolf January 30, 2025 11:37

silverweed self-assigned this Jan 30, 2025

silverweed requested a review from pcanal as a code owner January 30, 2025 11:37

jblomer reviewed Jan 30, 2025

io/io/inc/TFile.h (outdated, resolved)

silverweed mentioned this pull request Jan 30, 2025

[ntuple] RMiniFile: properly write the free slot's nbytes #17568

Merged

2 tasks

silverweed force-pushed the tfilemap_refactor branch 2 times, most recently from 0491d72 to 45adcda, January 30, 2025 14:31

silverweed force-pushed the tfilemap_refactor branch from ef10a54 to cde8d52, January 31, 2025 09:01

silverweed added 2 commits January 31, 2025 11:14

[io] Extract tkey walk logic from TFile::Map()

d516f4b

[io] Change TFile::WalkTKeys() to return an iterable

2bcccda

silverweed force-pushed the tfilemap_refactor branch from cde8d52 to 2bcccda, January 31, 2025 10:14

pcanal reviewed Feb 5, 2025

io/io/inc/TFile.h (resolved)

pcanal approved these changes Feb 5, 2025

silverweed merged commit 1d63179 into root-project:master Feb 5, 2025
21 checks passed

silverweed deleted the tfilemap_refactor branch February 5, 2025 15:36
