Open
Document the metadata URL and every TSV column mapped in transform_to_c2m2, grouped by document level (File, Collection, Biosample, Subject, DCC) with enriched subsections for extra fields.
Document the Search API URLs, entity matching patterns, and every API field mapped in the file and collection enrichment passes, grouped by document level with enriched subsections for DCC-specific fields.
Document the Search API URLs, entity matching strategy, and every API field mapped across the file, collection, and subject enrichment passes, grouped by document level with enriched subsections for DCC-specific fields.
conradbzura
commented
Feb 20, 2026
group_name → collections[].extra.hubmap.group_name
visualization → collections[].extra.hubmap.visualization
vitessce-hints → collections[].extra.hubmap.vitessce_hints
metadata → collections[].extra.hubmap.metadata
Collaborator
Author
We don't populate the promoted file_type_detailed field for HuBMAP, but this information seems like it can help with visualization.
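A minimal sketch of how the four HuBMAP fields quoted above could be nested under `collections[].extra.hubmap`. It assumes a Search API hit is a plain dict keyed by the source field names; the names `HUBMAP_EXTRA_FIELDS` and `hubmap_collection_extra` are hypothetical and not taken from the PR.

```python
# Source field -> key under extra.hubmap, as listed in the docstring excerpt.
# Only "vitessce-hints" is renamed (hyphen to underscore).
HUBMAP_EXTRA_FIELDS = {
    "group_name": "group_name",
    "visualization": "visualization",
    "vitessce-hints": "vitessce_hints",
    "metadata": "metadata",
}


def hubmap_collection_extra(hit):
    """Build the enriched sub-document for one collection (hypothetical helper).

    Fields missing from the hit are skipped rather than set to None.
    """
    extra = {
        target: hit[source]
        for source, target in HUBMAP_EXTRA_FIELDS.items()
        if source in hit
    }
    return {"extra": {"hubmap": extra}}
```

Keeping these values under `extra.hubmap` leaves the promoted fields untouched while still exposing the visualization hints to consumers that want them.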
nvictus
reviewed
Feb 20, 2026
File
~~~~
File accession → local_id
File download URL → access_url, filename (derived)
Member
This is fine. Note that there is also an s3_uri key that could serve as a substitute or fallback. It might be more performant too.
Member
Just noticed that they provide Azure URLs too now. And you include both in enriched. Good!
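The fallback idea in this thread could look roughly like the sketch below. It assumes each metadata row is a dict of strings keyed by column name, that the S3 location lives under a key named `s3_uri` as mentioned above, and that the filename is derived from the last path segment of the URL. The helper `resolve_access_url` and its `prefer_s3` flag are hypothetical, not part of the PR.

```python
from posixpath import basename
from urllib.parse import urlparse


def resolve_access_url(row, prefer_s3=False):
    """Return (access_url, filename) for one metadata row (hypothetical helper).

    Uses the download URL by default and falls back to s3_uri when it is
    empty; prefer_s3=True reverses the order.
    """
    download_url = (row.get("File download URL") or "").strip()
    s3_uri = (row.get("s3_uri") or "").strip()
    candidates = [s3_uri, download_url] if prefer_s3 else [download_url, s3_uri]
    access_url = next((url for url in candidates if url), None)
    if access_url is None:
        return None, None
    # Derive the filename from the final path segment of whichever URL won.
    filename = basename(urlparse(access_url).path) or None
    return access_url, filename
```

Whether S3 is actually faster would need measuring; the flag only makes the preference explicit.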
nvictus
reviewed
Feb 20, 2026
Size → size_in_bytes
md5sum → md5
File Status → status
Experiment date released → creation_time
Member
Note: check if "release" date in 4DN also maps to this key
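The file-level column mappings quoted in this thread can be expressed as a lookup table, which also makes it easy to compare one consortium's columns against another's. The sketch below assumes a TSV row arrives as a dict of strings and that `Size` should be cast to an integer; `FILE_COLUMN_MAP` and `map_file_row` are hypothetical names, and the actual `transform_to_c2m2` may be structured differently.

```python
# TSV column -> document field, copied from the docstring excerpts above.
FILE_COLUMN_MAP = {
    "File accession": "local_id",
    "Size": "size_in_bytes",
    "md5sum": "md5",
    "File Status": "status",
    "Experiment date released": "creation_time",
}


def map_file_row(row):
    """Map one TSV row to file-level fields (hypothetical helper).

    Empty or missing columns are omitted; unmapped columns are ignored.
    """
    doc = {}
    for column, field in FILE_COLUMN_MAP.items():
        value = (row.get(column) or "").strip()
        if not value:
            continue
        doc[field] = int(value) if field == "size_in_bytes" else value
    return doc
```

With a table like this per consortium, checking whether two sources feed the same key (for example `creation_time`) becomes a matter of comparing the dictionaries' values.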
Added field mappings for scraped data for each consortium as service-module-level docstrings.