[ntuple] Make RNTupleJoinProcessor composable #18224
enirolf merged 12 commits into root-project:master
Test Results: 18 files, 18 suites, 4d 9h 49m 34s. Results for commit affbe33.
hahnjo
left a comment
The four cleanup commits at the end are great! Some comments for consideration on the entry handling.
How does it fail? (Hopefully 'loudly' :) )
This is needed to properly handle join tables for chains of RNTuples.
Because it only returns the first entry index it encounters, subsequent entry mappings (and partitions) are skipped, giving a (potentially significant) performance improvement when only one entry index (without any further constraints) is required.
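To make the early-exit argument concrete, here is a minimal, self-contained sketch. It is not the ROOT implementation: `Partition`, `kNoMatch`, `FirstEntryIndex` and `AllEntryIndexes` are hypothetical names, and a partition is modelled as a plain hash map from a join value to the matching entry indexes.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for one partition of a join table: maps a join-field
// value to the entry indexes in the auxiliary data set that carry that value.
using Partition = std::unordered_map<std::uint64_t, std::vector<std::uint64_t>>;

// Hypothetical sentinel meaning "no entry found".
constexpr std::uint64_t kNoMatch = std::numeric_limits<std::uint64_t>::max();

// Return the first matching entry index and stop searching immediately.
std::uint64_t FirstEntryIndex(const std::vector<Partition> &partitions, std::uint64_t joinValue)
{
   for (const auto &partition : partitions) {
      auto it = partition.find(joinValue);
      if (it != partition.end() && !it->second.empty())
         return it->second.front(); // early exit: later partitions are never inspected
   }
   return kNoMatch;
}

// Collect every matching entry index: all partitions have to be visited.
std::vector<std::uint64_t> AllEntryIndexes(const std::vector<Partition> &partitions, std::uint64_t joinValue)
{
   std::vector<std::uint64_t> result;
   for (const auto &partition : partitions) {
      auto it = partition.find(joinValue);
      if (it != partition.end())
         result.insert(result.end(), it->second.begin(), it->second.end());
   }
   return result;
}
```

In this model the first-match lookup touches only as many partitions as needed, whereas collecting all matches always walks every partition.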
As a first approximation, use the default partition. This might be changed as things get further optimized.
With an exception, is that loud enough :D? I've added a test for it as well: root/tree/ntuple/test/ntuple_processor.cxx, lines 375 to 391 in d7e3e59
vepadulano
left a comment
Thanks a lot! Some minor comments for now
This in turn also makes the whole `RNTupleProcessor` engine composable, i.e., it is now possible to create chains of joins, joins of chains, chains of chains, etc. This initial implementation has one restriction: joins in which values are missing from the auxiliary data set are unsupported and will result in an exception. Proper support for these scenarios will be added later.
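As an illustration of what composability means here, the following is a heavily simplified sketch and not the ROOT API. All names (`Row`, `Processor`, `Single`, `Chain`, `Join`) are hypothetical; the join merges the primary and auxiliary rows by adding their values, purely to keep the interface uniform, and it throws when the auxiliary side has no matching key, mirroring the restriction described above.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <stdexcept>
#include <utility>
#include <vector>

// Hypothetical entry type: a join key plus one payload value.
struct Row {
   int key;
   double value;
};

// Common interface, so that any processor can be nested inside another one.
class Processor {
public:
   virtual ~Processor() = default;
   virtual std::size_t Size() const = 0;
   virtual Row Load(std::size_t i) const = 0;
};

// Leaf: a single in-memory data set.
class Single final : public Processor {
   std::vector<Row> fRows;

public:
   explicit Single(std::vector<Row> rows) : fRows(std::move(rows)) {}
   std::size_t Size() const override { return fRows.size(); }
   Row Load(std::size_t i) const override { return fRows.at(i); }
};

// Vertical composition: entries of the inner processors, one after another.
class Chain final : public Processor {
   std::vector<std::unique_ptr<Processor>> fInner;

public:
   explicit Chain(std::vector<std::unique_ptr<Processor>> inner) : fInner(std::move(inner)) {}
   std::size_t Size() const override
   {
      std::size_t n = 0;
      for (const auto &p : fInner)
         n += p->Size();
      return n;
   }
   Row Load(std::size_t i) const override
   {
      for (const auto &p : fInner) {
         if (i < p->Size())
            return p->Load(i);
         i -= p->Size(); // translate to an index local to the next inner processor
      }
      throw std::out_of_range("entry index beyond end of chain");
   }
};

// Horizontal composition: look up the primary entry's key in the auxiliary processor.
class Join final : public Processor {
   std::unique_ptr<Processor> fPrimary;
   std::unique_ptr<Processor> fAux;

public:
   Join(std::unique_ptr<Processor> primary, std::unique_ptr<Processor> aux)
      : fPrimary(std::move(primary)), fAux(std::move(aux))
   {
   }
   std::size_t Size() const override { return fPrimary->Size(); }
   Row Load(std::size_t i) const override
   {
      const Row primary = fPrimary->Load(i);
      for (std::size_t j = 0; j < fAux->Size(); ++j) {
         const Row aux = fAux->Load(j);
         if (aux.key == primary.key)
            return Row{primary.key, primary.value + aux.value};
      }
      // Missing auxiliary values are unsupported in this sketch: fail loudly.
      throw std::runtime_error("no matching entry in auxiliary processor");
   }
};
```

Because `Chain` and `Join` only depend on the `Processor` interface, a `Join` can take a `Chain` as its primary input and a `Chain` can contain `Join`s, which is the sense in which such a design is composable.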
With the refactored join processor and the addition of `REntry::Reset`, this friendship is not necessary anymore.
It is only still called in `RNTupleSingleProcessor::Connect`, so the implementation has been moved into this method.
An individual RNTuple is now fully contained in the `RNTupleSingleProcessor`, removing the need to (re)connect fields. This makes the `RFieldContext` class redundant.
It is not used by the `RNTupleChainProcessor` or the `RNTupleJoinProcessor` anymore.
This was already implemented for the other processor subclasses, but not yet for this one.
Add a description of the behaviour of this method in case multiple entry indexes exist.
vepadulano
left a comment
Congrats on this last step towards making the RNTupleProcessor composable! Maybe consider simplifying the commit history before merging.
hahnjo
left a comment
Thanks for addressing my comments!
This PR adds the possibility to compose `RNTupleJoinProcessor`s from existing processor objects. Similar functionality is already in place for the `RNTupleChainProcessor` (see #17393), so with these additions the `RNTupleProcessor` is (almost*) fully composable.

*One caveat: the case where an auxiliary processor in the join is a join itself is not yet properly handled. This will be handled in a follow-up PR.