fix(ingest): Enforce each metadata field either being grouped or segmented #2370
resolves #2308
preview URL: https://ingest-fix-grouping-issue.loculus.org/
Issue
usually_identical_fields cannot handle metadata fields of any type other than string. At this stage of ingest only 3 fields fall into this category anyway: 'ncbi_sourcedb', 'ncbi_virus_tax_id', and 'ncbi_virus_name'. Of these, 'ncbi_virus_tax_id' is of type int. 'ncbi_virus_name' and 'ncbi_virus_tax_id' will always be the same within a group (with the exception of influenza). It is also unlikely that 'ncbi_sourcedb' will differ. Since handling such edge cases is complicated anyway, I think we should cover them with manual curation for the moment and, for now, simply assign these 3 fields to either group or segment.
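For illustration, the constraint named in the PR title (every metadata field is either grouped or segmented, never both and never neither) could be checked roughly as below. This is a minimal sketch and not the actual ingest code: the function name `check_field_assignment`, its parameters, and the field `hypothetical_segment_field` are assumptions made up for the example; only the three `ncbi_*` field names come from the description above.

```python
def check_field_assignment(all_fields, grouped_fields, segmented_fields):
    """Raise ValueError unless every field is in exactly one of the two sets.

    Hypothetical helper: names and signature are assumptions, not the
    real ingest API.
    """
    grouped = set(grouped_fields)
    segmented = set(segmented_fields)

    # Fields listed in both sets are ambiguous.
    in_both = grouped & segmented
    # Fields listed in neither set have no defined merge behaviour.
    in_neither = set(all_fields) - grouped - segmented

    if in_both or in_neither:
        raise ValueError(
            f"fields in both sets: {sorted(in_both)}; "
            f"fields in neither set: {sorted(in_neither)}"
        )


# The three fields from the description are all assigned to "grouped";
# the segmented field is a made-up placeholder.
grouped = ["ncbi_sourcedb", "ncbi_virus_tax_id", "ncbi_virus_name"]
segmented = ["hypothetical_segment_field"]
check_field_assignment(grouped + segmented, grouped, segmented)
```

Because the check only looks at field names, it works the same for the int-typed 'ncbi_virus_tax_id' as for string fields.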
Summary
Remove the usually_identical_fields option. All 3 fields it contained (see above) will go to grouped.
Screenshot