Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ClusterCell collection lacking cellID encondings, or including cells from heterogeneous subsystems with different CellIDEncodings #84

Open
giovannimarchiori opened this issue Jun 18, 2024 · 5 comments

Comments

@giovannimarchiori
Copy link
Contributor

giovannimarchiori commented Jun 18, 2024

The clustered cell collection (https://github.com/HEP-FCC/k4RecCalorimeter/blob/main/RecCalorimeter/src/components/CreateCaloClustersSlidingWindow.cpp#L45C8-L45C23) can potentially hold cells from heterogeneous subsystems which then have inconsistent CellIDEncodings. The issue and further details are extensively described and discussed in #83

In addition, even if only one system is used, the cell collection attached to the cluster does not have the CellIDEncoding. This is indeed potentially a problem (analysis scripts later in the chain may need this information to properly set-up the decoder).

@BrieucF
Copy link
Contributor

BrieucF commented Jun 21, 2024

Among the two options discussed (have similar cellID encodings for the different sub-systems or having one cluster cell collection per subsystem) I would be in favor of the latter. What do you think?

@giovannimarchiori
Copy link
Contributor Author

just a reminder to @BrieucF and @tmadlener about the discussion we had in the past about this and some options discussed in #83 now that they are reviving the discussions on how to best handle this with podio/key4hep

@tmadlener
Copy link
Contributor

Just to be sure that I understand tho goal correctly: There is a ClusterCollection acting as a superset collection (subset collection in podio terms) that contains Clusters from several subsystems that have potentially different CellID encodings. Is this correct? Or is this combined collection actually a proper collection of cloned Clusters?

Regardless of which is the case for the collection, it's necessary to get to the correct CellID encoding for each cluster in order to interpret the CellID correctly.

The main problem currently is that the combined / superset collection doesn't have any metadata set, resp. if we would set a CellID encoding for that collection we could only set one (assuming we use the usual convention to set that). However, how we can solve this depends quite a bit on whether the collection is a subset collection or whether it is a proper collection.

@giovannimarchiori
Copy link
Contributor Author

Hi @tmadlener
What we have now is a ClusterCollection (which is not a subset collection) that contains clusters whose cells are from more than 1 system (e.g. ECAL + HCAL combined clusters), with different CellID encodings.

One could get the proper CellID by encoding in the metadata a dict of keys of type system : encoding, based on the assumption that system is always encoded in the first 4 bits of the cellID no matter the details of the readout, but that is neither elegant nor probably very robust and would maybe also break some design assumptions of podio.

By the way I was wondering after reading your remarks: do we have the possibility of creating (super)clusters of clusters? It could be interesting for instance for doing brem-recovery, but might also be a solution here if we persist our ECAL+HCAL cluster as just a cluster of 2 clusters, and the ECAL and HCAL clusters are stored in their own ClusterCollections with their metadata containing the proper cellID info.

@tmadlener
Copy link
Contributor

Reading your comment, I think we are potentially coming to a similar situation with tracking detectors. Currently the assumption is that the CellID encoding for the tracking system is defined globally for the full detector and all subdetectors have the same. However, e.g. in the ILD/CLD hybrid for FCCee the two global definitions do not agree and it might be beneficial to only have the overall layout (i.e. the structure of the encoding but not the individual bit widths) defined. In that case some of the leading parts will probably have to agree.

I don't necessarily see a problem with that. In the end it will be a convention that we will have to document and make sure that we follow it. From a robustness point of view, we could start to think about coming up with some checks to make sure that the parts that need to agree.

Overall, I think this would be worth a dedicated discussion to see if we can come up with a single solution rather than different solutions for calorimeters and trackers for example.


would maybe also break some design assumptions of podio

I don't think we are running into podio (technical or design) limitations here, but rather into some other assumptions / conventions that we have come up with.

do we have the possibility of creating (super)clusters of clusters?

That shouldn't be a problem, as edm4hep::Clusters define a OneToManyRelation to other edm4hep::Clusters.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants