Replicas are still missing for HCA #6597

Open
nadove-ucsc opened this issue Sep 28, 2024 · 0 comments
Labels
- [priority] Medium bug [type] A defect preventing use of the system as specified indexer [subject] The indexer part of Azul orange [process] Done by the Azul team

Comments

nadove-ucsc commented Sep 28, 2024

Follow-up from #6582

The linked issue adds replicas that were previously missing for many HCA entities, such as donors and some protocols. However, there are still HCA entities that are not being replicated. There are two distinct cases:

  1. Entities that are linked to a file but are not replicated because they are not tracked while traversing the links. An example is the dissociation_protocol in the canned bundle for test_indexing.
  2. Entities that are not linked to any file in their bundle.

The solution for case 1 is to modify the TransformerVisitor class to track all linked entities it encounters, potentially consolidating all currently untracked entities in a single data structure. These entities will then be emitted as replicas by the FileTransformer.
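To make the case 1 proposal concrete, here is a minimal sketch of a visitor that records every entity it reaches while walking links from a file, keyed by entity type in one structure. Everything in it is an assumption for illustration: `TrackingVisitor`, `walk`, the dict-based entities, and the adjacency-style `links` are hypothetical stand-ins and do not reflect the actual TransformerVisitor interface or Azul's link model.

```python
from collections import defaultdict


class TrackingVisitor:
    """Toy stand-in for TransformerVisitor (hypothetical, not Azul's API)."""

    def __init__(self):
        # entity type -> {entity ID: entity}. A single structure holds every
        # linked entity instead of one attribute per tracked entity type.
        self.entities = defaultdict(dict)

    def visit(self, entity):
        self.entities[entity['type']][entity['id']] = entity


def walk(start, links, entities, visitor):
    """Visit every entity reachable from `start` by following `links`."""
    seen = set()
    stack = [start]
    while stack:
        entity_id = stack.pop()
        if entity_id in seen:
            continue
        seen.add(entity_id)
        visitor.visit(entities[entity_id])
        stack.extend(links.get(entity_id, ()))
    return seen


# Made-up bundle: f1 is a file, x1 is not linked to any file (case 2).
entities = {
    'f1': {'id': 'f1', 'type': 'sequence_file'},
    's1': {'id': 's1', 'type': 'specimen_from_organism'},
    'd1': {'id': 'd1', 'type': 'donor_organism'},
    'p1': {'id': 'p1', 'type': 'dissociation_protocol'},
    'x1': {'id': 'x1', 'type': 'library_preparation_protocol'},
}
links = {'f1': ['s1', 'p1'], 's1': ['d1']}

visitor = TrackingVisitor()
reached = walk('f1', links, entities, visitor)
```

In this toy bundle the dissociation_protocol is picked up because the visitor no longer filters by entity type, while `x1` is never reached. That leftover is exactly what case 2 has to handle.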

The solution for case 2 is to modify the ProjectTransformer to emit a replica for every entity in its bundle. The hub IDs for these replicas will not include any file IDs. Duplicate replicas will be merged by the index service before any replicas are written to Elasticsearch.
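A rough sketch of how the case 2 replicas could combine with the file-based ones follows. It assumes that merging duplicates means taking the union of their hub IDs per entity, which this issue does not specify. The `Replica` class, `project_replicas`, and `merge_replicas` are hypothetical names, not part of Azul's index service.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Replica:
    # Hypothetical, simplified replica: identity plus the file IDs acting as hubs
    entity_type: str
    entity_id: str
    hub_ids: frozenset = frozenset()


def project_replicas(bundle_entities):
    """One replica per entity in the bundle, carrying no file hub IDs."""
    return [Replica(e['type'], e['id']) for e in bundle_entities]


def merge_replicas(replicas):
    """Collapse duplicates of the same entity, unioning hub IDs (assumed rule)."""
    merged = {}
    for replica in replicas:
        key = (replica.entity_type, replica.entity_id)
        merged[key] = merged.get(key, frozenset()) | replica.hub_ids
    return [Replica(t, i, hubs) for (t, i), hubs in merged.items()]


# Made-up bundle: p1 is linked to file f1, x1 is linked to no file.
bundle = [
    {'id': 'p1', 'type': 'dissociation_protocol'},
    {'id': 'x1', 'type': 'library_preparation_protocol'},
]
from_files = [Replica('dissociation_protocol', 'p1', frozenset({'f1'}))]
result = merge_replicas(from_files + project_replicas(bundle))
by_id = {r.entity_id: r for r in result}
```

Under that assumed rule, `p1` ends up as a single replica that keeps its file hub, and `x1` is still written, with no file hubs, so the entity is not lost.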

This design depends on the current implementation of the linked ticket, as in #6584.

@nadove-ucsc nadove-ucsc added orange [process] Done by the Azul team bug [type] A defect preventing use of the system as specified indexer [subject] The indexer part of Azul - [priority] Medium labels Sep 28, 2024