DAS-2368: Configure & support dimension variables for SMAP L3 at the root level#36
DAS-2368: Configure & support dimension variables for SMAP L3 at the root level#36joeyschultz merged 9 commits intomainfrom
Conversation
| } | ||
|
|
||
| return var_dim_map | ||
| # Identify dimensions that have been created up-level and update map |
There was a problem hiding this comment.
As the docstring notes, this code for updating the var_dim_map is only temporary required until earthdata-varinfo supports up-level dimensions.
flamingbear
left a comment
There was a problem hiding this comment.
This is mostly questions for you to educate me. I have a couple of suggestions. But it's very close, I did run the test requests and looked at the outputs. Nice.
| "Mission": "SMAP", | ||
| "ShortNamePath": "SPL3SMP$", | ||
| "VariablePattern": "/Soil_Moisture_Retrieval_Data_(AM|PM)/(?!lc_type$).+$" | ||
| "VariablePattern": "/Soil_Moisture_Retrieval_Data_(AM|PM)/.*" |
There was a problem hiding this comment.
and no lc_types? (are these gone now from exclusions?) looks like it's hoisted?
There was a problem hiding this comment.
See my explanation for why I removed the am_pm exclusion from the variable pattern in the above comment. The same applies here - just for lc_type in this case
metadata_annotator/annotate.py
Outdated
| for parent in node.parents: | ||
| dim_candidate_path = construct_dim_path(parent.path, dim) | ||
| if dim_candidate_path in variables_to_create: | ||
| parent.ds = parent.ds.assign(**{dim: dim_src}) |
There was a problem hiding this comment.
do you need to unpack that dict? couldn't you just assign it as a dict?
| parent.ds = parent.ds.assign(**{dim: dim_src}) | |
| parent.ds = parent.ds.assign({dim: dim_src}) |
metadata_annotator/annotate.py
Outdated
| for node in datatree.subtree | ||
| for dim in node.ds.dims | ||
| if (dim_full_path := construct_dim_path(node.path, dim)) in variables_to_create | ||
| } |
There was a problem hiding this comment.
I can't tell if this is clever or I'm just dense because I didn't know the walrus operator. I'm almost tempted to say it should just be nested loops for readability, but maybe I'm off.
There was a problem hiding this comment.
Discussed with Matt and decided to use nested for loop to simplify things. I had tried making the set comprehension work but I don't think its worth the complexity.
Updated in ae7d2bf.
| } | ||
|
|
||
| return var_dim_map | ||
| # Identify dimensions that have been created up-level and update map |
There was a problem hiding this comment.
I'm looking at the original and I don't understand what it does or how it's used.
It looks like it just get all of the dimension pairs, but then truncates all but one value from the list of variables.
L139-L142. I'm baffled why we would do that. (git blame says it was me, but the ticket link is to some epic)
| f'{parent.path}/{dimension_name}' | ||
| if parent.path != '/' | ||
| else f'/{dimension_name}' |
There was a problem hiding this comment.
Didn't you have a helper for this somewhere?
There was a problem hiding this comment.
Discussed with Matt and landed on using an inner helper function for this case. I'm unable to import annotate::construct_dim_path due to circular imports and don't think its worth creating a new module to get around that because this invocation of construct_dim_path will go away once the varinfo update is made.
Updated in d17535f.
| "Value": "y X" | ||
| } | ||
| ], | ||
| "_Description": "" |
There was a problem hiding this comment.
Either add a description or delete this, I think.
There was a problem hiding this comment.
Updated in da31df7.
In this commit, I also made some updates to the function that uses this test varinfo config entry (test_update_dimension_variables) to properly test that we are renaming the dimensions. I also moved the text fixture datatree and varinfo object directly into the test because this was the only test using them.
| assert set( | ||
| datatree['/Freeze_Thaw_Retrieval_Data_Global/surface_flag'].dims | ||
| ) == set(['am_pm', 'y', 'x']) | ||
|
|
There was a problem hiding this comment.
are these at the top level now? could they be checked there? doesn't look like it.
There was a problem hiding this comment.
This assertion of the renaming was deleted because this PR removes the dimension renaming functionality from update_group_and_variable_attributes.
tests/unit/test_annotate.py
Outdated
| ) | ||
| def test_has_dimension_override(sample_varinfo_test06, variable_path, expected): | ||
| """Ensure variables correctly evaluated to contain non-null dimensions override.""" | ||
| assert has_dimension_override(sample_varinfo_test06, variable_path) == expected |
There was a problem hiding this comment.
This is a lot of indirection to get to what this test does. I don't have a suggestion, mostly complaints.
There was a problem hiding this comment.
To make the test easier to understand, I moved the test datatree and varinfo object directly into the test function. These test06 fixtures were not used by any other tests so I think this change to creating them directly in the test function is better.
Updated in da31df7
| with xr.open_datatree(sample_netcdf4_file_test07) as test_datatree: | ||
| assert ( | ||
| get_new_dimension_variables(test_datatree, variables_to_create) == expected | ||
| ) |
There was a problem hiding this comment.
I don't really understand this test.
It might help if there was a test that asked for more variables to create than exist on the file. because that passes too.
I'm not even sure if that' useful information or if that would ever happen
e.g.
(
{'/y_root', '/sub_group/y', '/missing/dimension/z'},
{'/y_root', '/sub_group/y'},
),There was a problem hiding this comment.
I agree that the suggested test case is more useful than what I previously had.
Updated in da31df7
| { | ||
| "Name": "dimensions", | ||
| "Value": "y X" | ||
| "Value": "y x" |
flamingbear
left a comment
There was a problem hiding this comment.
Thanks for making all those changes. This looks good. Hopefully we can yank some of it out soon.
Description
This PR adds the ability to create up-level (shared) dimensions.
The varinfo configuration was modified to create shared dimensions at the root level for:
Jira Issue ID
DAS-2368
Local Test Steps
Check out this branch and build the new image and run the tests
Ensure your HIAB configuration has:
Restart harmony to pick up the new image:
Run the SPL3SMP_E bbox request in UAT to see the current behavior (dimensions created in each group):
https://harmony.uat.earthdata.nasa.gov/C1268399578-EEDTEST/ogc-api-coverages/1.0.0/collections/parameter_vars/coverage/rangeset?forceAsync=true&subset=lat(60%3A85)&subset=lon(-70%3A-10)&label=harmony-py&maxResults=1&variable=Soil_Moisture_Retrieval_Data_AM%2Fsurface_flag&variable=Soil_Moisture_Retrieval_Data_PM%2Fsurface_flag_pm&variable=Soil_Moisture_Retrieval_Data_Polar_AM%2Fsurface_flag&variable=Soil_Moisture_Retrieval_Data_Polar_PM%2Fsurface_flag_pm
Now run the same request on localhost to see the change (shared dimensions created up-level):
http://localhost:3000/C1268399578-EEDTEST/ogc-api-coverages/1.0.0/collections/parameter_vars/coverage/rangeset?forceAsync=true&subset=lat(60%3A85)&subset=lon(-70%3A-10)&label=harmony-py&maxResults=1&variable=Soil_Moisture_Retrieval_Data_AM%2Fsurface_flag&variable=Soil_Moisture_Retrieval_Data_PM%2Fsurface_flag_pm&variable=Soil_Moisture_Retrieval_Data_Polar_AM%2Fsurface_flag&variable=Soil_Moisture_Retrieval_Data_Polar_PM%2Fsurface_flag_pm
Accompanying regression test PRs:
PR Acceptance Checklist
CHANGELOG.mdupdated to include high level summary of PR changes.docker/service_version.txtupdated if publishing a release.