Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Mutually mask SDR inputs to avoid problems routing #1435

Merged
merged 10 commits into from
Nov 17, 2023

Conversation

phargogh
Copy link
Member

This PR addresses two issues that are actually pointing to the same problem, which is that the user is missing data where the DEM is valid. I settled on solving this by masking the user's input to avoid needing to support users who don't read any documentation about this. No change should be needed in the User's Guide as a result of this change.

Fixes #1433
Fixes #911

@emlys and @dcdenu4, both of you were involved in the discussion in #911 , so I wanted to be sure you saw this.

@newtpatrol, feel free to review if you would like! I mostly wanted to include you here as an FYI.

Checklist

  • Updated HISTORY.rst and link to any relevant issue (if these changes are user-facing)
  • Updated the user's guide (if needed)
    - [ ] Tested the Workbench UI (if relevant)

@phargogh phargogh self-assigned this Oct 23, 2023
@phargogh
Copy link
Member Author

This solution is waiting on the next release of pygeoprocessing.

Copy link
Member

@dcdenu4 dcdenu4 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Look good, thanks @phargogh! I know this is still a draft, so I'll wait to approve.

Comment on lines 642 to 644
func=pygeoprocessing.raster_map,
kwargs={
'op': lambda array, mask: array,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I haven't used raster_map much yet, but this is cool to see!

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Definitely! Masking is such an easy thing to express within the context of raster_map, it was just too good to pass up.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately, using a lambda like this will break when n_workers > 0. I just discovered this issue while working on #1437. Because lambdas can't be pickled, they can't be directly passed into a taskgraph Task. op has to be a named function defined at the top level of the module.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Aha, of course! Thanks for the reminder. Patched.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@phargogh should this be _mask_op now in kwargs?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh yes, I missed one lambda and patched it in f33cf70

@phargogh
Copy link
Member Author

This is waiting on #1437

Comment on lines 642 to 644
func=pygeoprocessing.raster_map,
kwargs={
'op': lambda array, mask: array,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately, using a lambda like this will break when n_workers > 0. I just discovered this issue while working on #1437. Because lambdas can't be pickled, they can't be directly passed into a taskgraph Task. op has to be a named function defined at the top level of the module.

@phargogh phargogh requested review from emlys and dcdenu4 November 4, 2023 00:58
@phargogh phargogh marked this pull request as ready for review November 4, 2023 00:58
Copy link
Member

@dcdenu4 dcdenu4 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @phargogh , just one item related to Emily's comment.

Comment on lines 642 to 644
func=pygeoprocessing.raster_map,
kwargs={
'op': lambda array, mask: array,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@phargogh should this be _mask_op now in kwargs?

@phargogh phargogh requested a review from dcdenu4 November 6, 2023 16:14
Copy link
Member

@dcdenu4 dcdenu4 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @phargogh, there are some failing tests that indicate something might have gotten out of place...

@newtpatrol
Copy link
Contributor

I just want to make sure I understand what "mutually masking" means. Are you overlaying all of the raster inputs, and anywhere any of the input layers have NoData, set the rest of the input layers to NoData, including the DEM? And this would be done before any USLE/SDR calculations are begun?

This seems fine to me, and we do already say that anywhere there's NoData in the inputs there will be NoData in the outputs. It will be interesting to see what kind of effect doing this has on outputs that rely on upslope/downslope area, which may now be missing data - seems like a lot of NoData propagation might happen. Depending on how this shakes out, I might end up writing more explanation and provide a visual example in the UG.

It also brings to the forefront to make sure that we advise on how to fill NoData in all of the input layers that are likely to have them. I think most of this is already done in the UG, but probably not for all relevant inputs.

@phargogh
Copy link
Member Author

phargogh commented Nov 7, 2023

I just want to make sure I understand what "mutually masking" means. Are you overlaying all of the raster inputs, and anywhere any of the input layers have NoData, set the rest of the input layers to NoData, including the DEM? And this would be done before any USLE/SDR calculations are begun?

@newtpatrol yes, if this change is merged, we are overlaying all of the aligned raster inputs and anywhere any of the aligned layers have a nodata pixel, the whole stack is nodata. Then we use that mask on all of the aligned layers. This guarantees a consistent stack of operable pixels. Yes, this would be done before any calculations are done, including USLE/SDR.

Copy link
Member

@emlys emlys left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @phargogh! This is a really nice and simple solution to a complex bug

@dcdenu4 dcdenu4 merged commit 49990e0 into natcap:main Nov 17, 2023
24 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Large swaths of nodata in SDR SDR: sed_deposition has large swaths of nodata in Gura sample data
4 participants