Update Outlier Detection to use stcal #1357

Merged
merged 17 commits into from
Oct 2, 2024

braingram (Collaborator) commented Aug 8, 2024

Closes #1302
Closes #1423
Fixes RCAL-926

This PR updates outlier detection to use common code in stcal. It largely mirrors the code in spacetelescope/jwst#8840

This PR also removes a few unused items from the outlier detection step spec:

  • nlow
  • nhigh
  • grow
  • allowed_memory
  • kernel_size

And updates the input handling to raise an exception (instead of issuing a warning) for invalid input (as described in #1302).
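The difference between the two input-handling behaviors can be sketched as follows. This is a minimal illustration, not romancal's code: `FakeLibrary`, `check_input_old`, and `check_input_new` are hypothetical names, and the choice of `TypeError` is an assumption.

```python
import warnings


class FakeLibrary:
    """Hypothetical stand-in for the container type the step accepts."""

    def __init__(self, models):
        self.models = list(models)


def check_input_old(input_models):
    """Old-style behavior: warn and signal that the step should be skipped."""
    if not isinstance(input_models, FakeLibrary):
        warnings.warn("Invalid input; skipping outlier detection", stacklevel=2)
        return False
    return True


def check_input_new(input_models):
    """New-style behavior: fail loudly on invalid input (exception type assumed)."""
    if not isinstance(input_models, FakeLibrary):
        raise TypeError(
            f"Expected a FakeLibrary, got {type(input_models).__name__}"
        )
    return input_models
```

With the second form a caller that passes the wrong type sees an error immediately instead of a silently skipped step.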

The docs updates in this PR reference the step spec rather than specifying defaults in the docs (which are currently out of sync with the code).
Link to page which has most of the docs updates: https://roman-pipeline--1357.org.readthedocs.build/en/1357/roman/outlier_detection/arguments.html

Regression tests all pass https://github.com/spacetelescope/RegressionTests/actions/runs/11114749475

Checklist

  • added entry in CHANGES.rst under the corresponding subsection
  • updated relevant tests
  • updated relevant documentation
  • updated relevant milestone(s)
  • added relevant label(s)
  • ran regression tests, post a link to the Jenkins job below.

@github-actions github-actions bot added the testing and documentation labels Aug 8, 2024

codecov bot commented Aug 8, 2024

Codecov Report

Attention: Patch coverage is 92.52874% with 13 lines in your changes missing coverage. Please review.

Project coverage is 78.38%. Comparing base (b652ef8) to head (9a2df83).
Report is 18 commits behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| romancal/resample/resample.py | 80.76% | 10 Missing ⚠️ |
| romancal/outlier_detection/utils.py | 96.66% | 3 Missing ⚠️ |
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1357      +/-   ##
==========================================
- Coverage   78.66%   78.38%   -0.28%     
==========================================
  Files         117      118       +1     
  Lines        7724     7680      -44     
==========================================
- Hits         6076     6020      -56     
- Misses       1648     1660      +12     
| Flag | Coverage Δ | *Carryforward flag |
|---|---|---|
| nightly | 62.12% <ø> (ø) | Carriedforward from b652ef8 |

*This pull request uses carry forward flags. Click here to find out more.


CHANGES.rst Outdated
Comment on lines 52 to 91
- Remove unused arguments to outlier detection. [#1357]

- Update input handling to raise an exception on an invalid input instead
of issuing a warning and skipping the step. [#1357]

Collaborator

Suggested change
- Remove unused arguments to outlier detection. [#1357]
- Update input handling to raise an exception on an invalid input instead
of issuing a warning and skipping the step. [#1357]

now that #1375 is merged (switching change log handling to towncrier) this change log entry should be a file in changes/ instead:

printf 'Remove unused arguments to outlier detection.\n\nUpdate input handling to raise an exception on an invalid input instead of issuing a warning and skipping the step.\n' > changes/1357.outlier_detection.rst

(this is just because #1375 was merged while active PRs existed; new PRs will have these instructions in their checklist)
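For anyone who would rather not rely on shell escaping, the same fragment could be written from Python. This is only a sketch: `write_fragment` is a hypothetical helper, not part of towncrier or romancal, and the only details taken from the comment above are the `changes/` directory, the `1357.outlier_detection.rst` file name, and the two entries.

```python
import tempfile
from pathlib import Path


def write_fragment(changes_dir, pr_number, category, entries):
    """Write a change log fragment named <pr_number>.<category>.rst.

    Entries are separated by a blank line, matching the two-entry text
    in the shell command above.
    """
    changes_dir = Path(changes_dir)
    changes_dir.mkdir(parents=True, exist_ok=True)
    fragment = changes_dir / f"{pr_number}.{category}.rst"
    fragment.write_text("\n\n".join(entries) + "\n", encoding="utf-8")
    return fragment


entries = [
    "Remove unused arguments to outlier detection.",
    "Update input handling to raise an exception on an invalid input "
    "instead of issuing a warning and skipping the step.",
]

# Written to a temporary directory here; in the repository the target
# would be the changes/ directory named in the comment above.
with tempfile.TemporaryDirectory() as tmp:
    path = write_fragment(Path(tmp) / "changes", 1357, "outlier_detection", entries)
    fragment_name = path.name
    fragment_text = path.read_text(encoding="utf-8")
```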

Collaborator Author

Changelog updated to new format. Let me know if this looks good to you.

@github-actions github-actions bot added the dependencies label Sep 30, 2024
pyproject.toml Outdated
"stcal @ git+https://github.com/spacetelescope/stcal.git@main",
# "stcal>=1.8.0,<1.9.0",
# "stcal @ git+https://github.com/spacetelescope/stcal.git@main",
"stcal @ git+https://github.com/emolter/stcal.git@JP-3768",
Collaborator Author

Temporarily set to source branch for spacetelescope/stcal#292 for testing that (and this) PR.

@braingram braingram changed the title from "Outlier detection cleanup args" to "Update Outlier Detection to use stcal" Sep 30, 2024
@braingram braingram marked this pull request as ready for review September 30, 2024 21:56
@braingram braingram requested a review from a team as a code owner September 30, 2024 21:56
@braingram
Collaborator Author

braingram commented Sep 30, 2024

For the small (3-input) association described in more detail at https://innerspace.stsci.edu/display/SCSB/Romancal+ModelLibrary+Performance the memory profile of a run of the mosaic pipeline with this PR is:
[Screenshot (2024-09-30, 6:09:56 PM): memory profile of the mosaic pipeline run]
Outlier detection (roughly the period from 18:03:30 until the dip before 18:04:30):

  • no longer produces the peak usage for the pipeline run
  • consumes 9.6 GB of memory (vs 13.1 GB in the "post library" run)
  • takes ~1 minute (vs 3 minutes in the "post library" run)
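As a quick worked check of the figures in the list above (the runtimes are approximate, so the speedup is only a rough factor; `relative_reduction` is a helper name introduced here for illustration):

```python
def relative_reduction(before, after):
    """Fractional reduction going from `before` to `after`."""
    return (before - after) / before


# Memory use during outlier detection, in GB, from the comment above.
memory_saving = relative_reduction(13.1, 9.6)

# Approximate runtimes in minutes, from the comment above.
speedup = 3 / 1

print(f"memory reduced by {memory_saving:.0%}")  # memory reduced by 27%
print(f"runtime roughly {speedup:.0f}x faster")  # runtime roughly 3x faster
```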

@braingram
Collaborator Author

Pinging @emolter since I can't request you as a reviewer. If you have a chance to look over this PR that would be greatly appreciated!

Contributor

@emolter emolter left a comment

Looks great aside from some minor questions and comments Brett!

Are there any regression tests that save intermediate files from the OutlierDetectionStep? I don't see any regtest updates here, and I know in JWST we kept running into problems with intermediate filenames partially because those lacked test coverage

docs/roman/outlier_detection/arguments.rst (outdated, resolved)
docs/roman/outlier_detection/arguments.rst (outdated, resolved)
docs/roman/outlier_detection/arguments.rst (resolved)
docs/roman/outlier_detection/arguments.rst (outdated, resolved)
docs/roman/outlier_detection/arguments.rst (outdated, resolved)
romancal/outlier_detection/utils.py (resolved)
log.info(f"{np.count_nonzero(cr_mask)} pixels marked as outliers")


def detect_outliers(
Contributor

probably a nitpick, but is the detect_outliers function really a utility that belongs in utils?

Collaborator Author

I struggled to come up with a way to organize/name this so suggestions are welcome. Putting it in "imaging" (like in jwst) doesn't make sense here (as there's no non-imaging modes). Putting it in another "outlier_detection" submodule is overly redundant ("romancal.outlier_detection.outlier_detection.detect_outliers"). So I gave up and went with the standard "utils".

Contributor

It's not that many lines. Is there a reason why it needs to be its own function, given that there is only one outlier detection mode for romancal? That is, can you just embed the code into OutlierDetectionStep?

Collaborator Author

I considered that. I believe the convention for romancal (as described by @mairanteodoro ) is to have the steps handle the stpipe-specific stuff and have as much of the "meat" of the processing as possible occur in a separate function/class/utility. If it's ok with you let's see what he thinks.
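The convention described here, a thin step that delegates to a framework-free function, can be sketched roughly as below. Everything in it is hypothetical: `SketchStep` is not an stpipe class, and `flag_outliers` is a toy median test, not the algorithm in `detect_outliers`.

```python
import logging
from statistics import median

log = logging.getLogger(__name__)


def flag_outliers(values, threshold):
    """Processing function: knows nothing about the pipeline framework.

    Flags values that differ from the median by more than `threshold`.
    """
    center = median(values)
    return [abs(v - center) > threshold for v in values]


class SketchStep:
    """Step wrapper: holds parameters and logs, then delegates the work."""

    def __init__(self, threshold=5.0):
        self.threshold = threshold

    def process(self, values):
        mask = flag_outliers(values, self.threshold)
        log.info("%d pixels marked as outliers", sum(mask))
        return mask


mask = SketchStep(threshold=5.0).process([1.0, 1.0, 1.0, 100.0])
```

One benefit of this split is that `flag_outliers` can be unit tested directly, without constructing a step.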

Contributor

sounds good it's definitely nbd either way as far as I'm concerned

romancal/outlier_detection/utils.py (resolved)
romancal/resample/resample.py (resolved)

Used for outlier detection
indices : list
Contributor

Consider adding a one-line description of what the indices correspond to (I realize I didn't do any of these docstring updates for the jwst code either...)

Collaborator Author

How does b8ba8af look?

Contributor

looks good except (nitpick) the second instance of "and will not" could be changed to "nor"

Collaborator Author

I'm going to leave this as-is in part because I can't think of a good way to fit in a "nor". I know it's considered a universal gate so it should be able to do anything (bad joke)!

Contributor

sounds good

@braingram braingram force-pushed the outlier_detection_cleanup branch 2 times, most recently from e6ca79e to e3f7956 Compare October 1, 2024 16:04
Contributor

@emolter emolter left a comment

All of my comments have been answered. I'm still wondering though whether there is test coverage of the intermediate files somewhere in the regtests.

@braingram
Collaborator Author

> All of my comments have been answered. I'm still wondering though whether there is test coverage of the intermediate files somewhere in the regtests.

Thanks for taking a look and for the reminder on that point. There's a unit test that saves intermediate results:
https://github.com/spacetelescope/romancal/pull/1357/files#diff-efb2f2b9bafb3583972a4d54071f5cf19254a042c426746575536554269d1290R115

Collaborator

@schlafly schlafly left a comment

This looks good to me. A few questions, though:

  • FWIW, either accepting weight_type as ivm with a default of IVM or passing it through correctly works fine for me. We do want to be using that but we don't need to do that through hard coding. This is especially true for the _median_without_resampling case, which we don't expect to use much in production.
  • It's great to move the common code to stcal and to improve performance. It looks to me like it's the case that the new outlier_detection.utils module contains no Roman-specific code, except maybe through the ModelLibrary infrastructure. Is that a target of future stcal common code consolidation?

@braingram
Collaborator Author

Thanks!

> This looks good to me. A few questions, though:
>
> * FWIW, either accepting weight_type as ivm with a default of IVM or passing it through correctly works fine for me. We do want to be using that but we don't need to do that through hard coding. This is especially true for the _median_without_resampling case, which we don't expect to use much in production.

Sorting out the weight handling sounds great. I opened an issue to track that #1428 but I think it would be a little more involved. I don't expect we'll see regtest differences but I do think adding some unit tests (to verify that changing the weight type has the intended effect) is useful.
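A unit test of that kind might look something like the sketch below. It is a toy, not romancal's resample code: `build_weights` and `weighted_mean` are hypothetical helpers, and `"exptime"` is assumed here as the alternative to `"ivm"`. The point is only that the test should fail if the weight type is silently ignored.

```python
import math


def build_weights(variances, exptime, weight_type="ivm"):
    """Return per-sample weights for the requested (assumed) weight type."""
    if weight_type == "ivm":
        # Inverse-variance weights; non-positive variance gets zero weight.
        return [1.0 / v if v > 0 else 0.0 for v in variances]
    if weight_type == "exptime":
        return [float(exptime)] * len(variances)
    raise ValueError(f"Unknown weight_type: {weight_type!r}")


def weighted_mean(values, weights):
    """Weighted average of the samples."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


def test_weight_type_changes_result():
    values = [10.0, 20.0, 50.0]
    variances = [1.0, 1.0, 100.0]  # the third sample is very noisy

    ivm = weighted_mean(values, build_weights(variances, 100.0, "ivm"))
    exptime = weighted_mean(values, build_weights(variances, 100.0, "exptime"))

    # Inverse-variance weighting should pull the result away from the
    # noisy sample; uniform exposure-time weights should not.
    assert math.isclose(ivm, 30.5 / 2.01)
    assert math.isclose(exptime, 80.0 / 3.0)
    assert ivm < exptime


test_weight_type_changes_result()
```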

> * It's great to move the common code to stcal and to improve performance. It looks to me like it's the case that the new outlier_detection.utils module contains no Roman-specific code, except maybe through the ModelLibrary infrastructure. Is that a target of future stcal common code consolidation?

Good point! I think there is more that could be done here once resample is moved to stcal. At the moment the link between outlier detection and resample puts some limitations on what can be pulled out of jwst/romancal.

Collaborator

@zacharyburnett zacharyburnett left a comment

more abstraction! the changelog looks good to me

@braingram braingram merged commit 8601da8 into spacetelescope:main Oct 2, 2024
30 of 31 checks passed
@braingram braingram deleted the outlier_detection_cleanup branch October 2, 2024 14:12
@nden nden added this to the 25Q1_B16 milestone Oct 3, 2024
Labels
dependencies, documentation, testing
Projects
None yet
5 participants