fix(ingest): Enforce each metadata field either being grouped or segmented #2370

Merged
merged 3 commits into from
Aug 14, 2024

Conversation

anna-parker
Contributor

@anna-parker anna-parker commented Aug 2, 2024

resolves #2308

preview URL: https://ingest-fix-grouping-issue.loculus.org/

Issue

usually_identical_fields cannot handle metadata fields of any type other than string. At this stage of ingest only 3 fields fall into this category anyway: 'ncbi_sourcedb', 'ncbi_virus_tax_id' and 'ncbi_virus_name'. Of these, 'ncbi_virus_tax_id' is of type int. 'ncbi_virus_name' and 'ncbi_virus_tax_id' will (with the exception of influenza) always be the same per group. It is also unlikely that 'ncbi_sourcedb' will differ within a group.

As handling such edge cases is complicated anyway, I think we should cover them with manual curation for now and simply assign each of these 3 fields to either grouped or segmented.

Summary

Remove the usually_identical_fields option. All 3 fields that were in this option (see above) will go to grouped.
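For illustration, the rule in the PR title (every metadata field is either grouped or segmented, never both and never neither) could be checked roughly like this. This is a minimal sketch, not the code from this PR: the function name, the list names and the 'length' field are hypothetical, and only the three ncbi_* field names come from the description above.

```python
def validate_field_assignment(all_fields, grouped_fields, segmented_fields):
    """Raise ValueError unless every field is in exactly one of the two lists."""
    grouped, segmented = set(grouped_fields), set(segmented_fields)
    # Fields listed twice would be treated ambiguously.
    in_both = sorted(grouped & segmented)
    # Fields listed nowhere are what usually_identical_fields used to absorb.
    in_neither = sorted(set(all_fields) - grouped - segmented)
    if in_both or in_neither:
        raise ValueError(f"in both lists: {in_both}; unassigned: {in_neither}")


# Hypothetical configuration: the three fields named in this PR go to grouped.
GROUPED = ["ncbi_sourcedb", "ncbi_virus_tax_id", "ncbi_virus_name"]
SEGMENTED = ["length"]  # hypothetical per-segment field, for illustration only
ALL_FIELDS = GROUPED + SEGMENTED

validate_field_assignment(ALL_FIELDS, GROUPED, SEGMENTED)  # passes silently
```

The check only looks at field names, so the type of a field's values (int for 'ncbi_virus_tax_id', string for the others) does not matter here.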

@anna-parker anna-parker added the "preview" label (Triggers a deployment to argocd) Aug 2, 2024
@anna-parker anna-parker changed the title from "Ingest fix grouping issues" to "fix(ingest): Enforce each metadata field either being grouped or segmented" Aug 2, 2024
@anna-parker anna-parker marked this pull request as ready for review August 5, 2024 13:24
Member

@theosanderson theosanderson left a comment

PRs that remove lines are great :D

@anna-parker
Contributor Author

@corneliusroemer is this ok with you? I know you had some other ideas for how to handle this issue

@anna-parker
Contributor Author

I will merge this - if later on we want to add this back we can revert this PR :-)

@anna-parker anna-parker merged commit 8a52d45 into main Aug 14, 2024
10 checks passed
@anna-parker anna-parker deleted the ingest_fix_grouping_issues branch August 14, 2024 11:30
Labels
preview Triggers a deployment to argocd
Development

Successfully merging this pull request may close these issues.

Ingest/ Grouping: How to handle fields that are usually identical but can sometimes vary?
2 participants