Add common resample code to stcal #279

mcara · 2024-08-08T13:37:28Z

This PR adds the common resample code used by both the JWST and Roman pipelines to stcal. It is also the first use of the new drizzle API from spacetelescope/drizzle#134 in the resample code used by the pipelines.

This work is related to https://jira.stsci.edu/browse/AL-835

At the moment this is a very rough draft for illustration purposes. It should run with default arguments (input_models and the output file name can be specified; nothing else is guaranteed to work). There are no unit or regression tests, and the documentation may not match the code.

Checklist

added entry in CHANGES.rst (either in Bug Fixes or Changes to API)
updated relevant tests
updated relevant documentation
updated relevant milestone(s)
added relevant label(s)

codecov · 2024-08-08T13:42:20Z

Codecov Report

Attention: Patch coverage is 1.91939% with 511 lines in your changes missing coverage. Please review.

Project coverage is 80.54%. Comparing base (2112773) to head (a6ffe26).
Report is 148 commits behind head on main.

Files with missing lines	Patch %	Lines
src/stcal/resample/resample.py	1.73%	510 Missing ⚠️
src/stcal/resample/__init__.py	50.00%	1 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##             main     #279      +/-   ##
==========================================
- Coverage   84.32%   80.54%   -3.79%     
==========================================
  Files          41       48       +7     
  Lines        7529     9325    +1796     
==========================================
+ Hits         6349     7511    +1162     
- Misses       1180     1814     +634

kmacdonald-stsci · 2024-08-08T16:16:34Z

src/stcal/resample/resample.py

+
+    @abc.abstractmethod
+    def run(self):
+        ...


Instead of using an ellipsis, would it be better to raise an exception with a message saying "method not implemented"?

by marking it as an abstractmethod this is done automatically if you try to call it directly
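
For reference, a minimal sketch of the behavior described in this reply, assuming the base class derives from `abc.ABC`. The class names below are hypothetical stand-ins, not the actual classes in this PR. Python refuses to instantiate a class that still has unimplemented abstract methods, so an explicit "not implemented" exception is not needed for that case:

```python
import abc


class ResampleSketch(abc.ABC):
    """Hypothetical stand-in for the abstract base class in the PR."""

    @abc.abstractmethod
    def run(self):
        ...


class Incomplete(ResampleSketch):
    """Subclass that forgets to override run()."""


class Complete(ResampleSketch):
    """Subclass that provides the required method."""

    def run(self):
        return "resampled"


def try_instantiate(cls):
    """Return an instance, or the TypeError message if cls is still abstract."""
    try:
        return cls()
    except TypeError as err:
        return str(err)
```

The check happens at instantiation time. A subclass that overrides `run()` and calls `super().run()` would still execute the `...` body and get `None` back rather than an exception.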

kmacdonald-stsci · 2024-08-08T16:27:32Z

src/stcal/resample/utils.py

Is this file necessary? It is imported by only one other file. It is not a set of common functions shared among many files. It's two short functions.

braingram · 2024-08-23T14:18:20Z

Would you rebase this and add a drizzle dependency with the branch for spacetelescope/drizzle#134 so we can see the CI run?

braingram · 2024-08-23T15:01:37Z

src/stcal/resample/utils.py

@@ -0,0 +1,36 @@
+import numpy as np
+from stdatamodels.dqflags import interpret_bit_flags


Suggested change

from stdatamodels.dqflags import interpret_bit_flags

from stdatamodels.dqflags import interpret_bit_flags

Is there a non-stdatamodels way to handle this? We can't make stdatamodels a dependency of this code.

Yes, there is. I'll fix it. I forgot to switch to astropy, although stdatamodels.dqflags.interpret_bit_flags(), for reasons unknown to me, does things slightly differently. I think that function should be removed altogether from stdatamodels.dqflags.

oh, I did it in resample.py but not in utils.

braingram · 2024-08-23T15:10:09Z

src/stcal/resample/resample.py

+
+        self.final_post_processing()
+
+        self._output_model.write(self._output_filename, overwrite=True)


This won't work for roman since write does not exist for roman_datamodels.DataModel and overwrite is not a valid keyword argument for roman_datamodels.DataModel.save.

I think it's best if we leave all file-IO up to the pipeline and not include it here in stcal.

that was an omission. I'll fix this.
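
A minimal sketch of the separation suggested above, with hypothetical names throughout (`ResampleSketch`, `pipeline_step`, and the JSON output are illustrative assumptions, not code from this PR). The library-side class only returns its result, and the pipeline-side caller decides how and where to write it:

```python
import json
from pathlib import Path


class ResampleSketch:
    """Hypothetical library-side class: computes a result, never touches disk."""

    def __init__(self, input_values):
        self._input_values = list(input_values)

    def run(self):
        # Stand-in for the real resampling: just average the inputs.
        mean = sum(self._input_values) / len(self._input_values)
        return {"data": mean, "n_inputs": len(self._input_values)}


def pipeline_step(input_values, output_path):
    """Hypothetical pipeline-side caller: owns the file format and the write."""
    result = ResampleSketch(input_values).run()
    Path(output_path).write_text(json.dumps(result))
    return result
```

With this split, each pipeline can call its own save method on the returned object without the shared code needing to know which method exists.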

emolter

my main question is still the extent to which we assume the structure and attributes of datamodels, and now also of ModelLibrary

emolter · 2024-08-23T15:15:12Z

src/stcal/resample/resample.py

+    """Raised when the output is too large for in-memory instantiation"""
+
+
+def output_wcs_from_input_wcs(input_wcs_list, pixel_scale_ratio=1.0,


I might put this somewhere other than resample.py, e.g. alignment/utils or alignment/resample_utils. I would vote (eventually) for having a separate submodule for WCS-related utility functions, but that's probably beyond the PR scope

Agree. It was temporary here. This will be fixed once we have a function that computes output WCS from s_region of input data models and other parameters.

emolter · 2024-08-23T15:25:03Z

src/stcal/resample/resample.py

+
+        # loop over only science exposures in the ModelLibrary
+        # sci_indices = self._input_models.ind_asn_type("science")
+        with self._input_models:


We seem to be in the same infinite loop as before here, about whether or not to make certain datamodels dependencies of stcal, either implicitly or explicitly. It seems to me that with this and other code lines, ResampleBase can only be used if the input models are assumed to be ModelLibrary instances. Are we okay with this? If so, should we just make this explicit in some way by specifying the input model type inside the ResampleBase abstract base class, and make stpipe a dependency?

> We seem to be in the same infinite loop as before here, about whether or not to make certain datamodels dependencies of stcal, either implicitly or explicitly. It seems to me that with this and other code lines, ResampleBase can only be used if the input models are assumed to be ModelLibrary instances.

True. It is difficult to maximize the common code ported to stcal, with the purpose of reducing the maintenance burden in the pipelines, given the rigid usage imposed by ModelLibrary (by contrast, ModelContainer behaved like a standard list). I can try to see how to bury ModelLibrary in the "IO" class, but it's not going to be pretty.

> Are we okay with this? If so, should we just make this explicit in some way by specifying the input model type inside the ResampleBase abstract base class, and make stpipe a dependency?

I don't know what the plans are for these packages, but I do not think they are intended to be general-purpose algorithms like those in astropy (or even drizzle), so I do not see an issue with using structures shared by all pipelines.

emolter · 2024-08-23T15:26:21Z

src/stcal/resample/resample.py

+
+                try:
+                    if self.get_model_attr_value(model, "exptype").upper() != "SCIENCE":
+                        self._input_models.shelve(model, modify=False)


usually when there's a try...except clause inside a ModelLibrary context where the try could fail while models are borrowed from the library, it's necessary to put the shelve into a finally block so it executes regardless of the path the try...except takes

actually, if an exception is triggered then there is some other processing after the try-except block and then the model is closed. So, I think, the code is fine: this was intentional.
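
The pattern under discussion can be sketched with a toy stand-in for the library. The `ToyLibrary` class below is hypothetical and only mimics borrow/shelve bookkeeping; it is not the real ModelLibrary API. It shows how a `finally` block guarantees that a borrowed model is shelved whichever path the try-except takes:

```python
class ToyLibrary:
    """Hypothetical stand-in that tracks which models are on loan."""

    def __init__(self, models):
        self._models = list(models)
        self.on_loan = set()

    def __len__(self):
        return len(self._models)

    def borrow(self, index):
        self.on_loan.add(index)
        return self._models[index]

    def shelve(self, index):
        self.on_loan.discard(index)


def science_models(library):
    """Collect science exposure names, shelving every model even on error."""
    selected = []
    for index in range(len(library)):
        model = library.borrow(index)
        try:
            if model["exptype"].upper() == "SCIENCE":
                selected.append(model["name"])
        except (KeyError, AttributeError):
            # Missing or malformed metadata: skip this model.
            pass
        finally:
            # Runs on success, on the handled error, and on any
            # unexpected exception that propagates out of the try.
            library.shelve(index)
    return selected
```

Whether the PR's code needs this depends on what runs after its try-except block, as the reply above notes; the sketch only illustrates the general guarantee that `finally` provides.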

emolter · 2024-08-23T15:35:41Z

src/stcal/resample/resample.py

+    """
+    resample_suffix = 'i2d'
+    resample_file_ext = '.fits'
+    n_arrays_per_output = 2  # #flt-point arrays in the output (data, weight, var, err, etc.)


why is this hard-coded? Will it be different for co-add and single resamples? Don't we have multiple types of variance that need to be in memory simultaneously in the current version of resample, i.e., doesn't this underestimate the memory usage when it's used in check_memory_requirements?
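
To make the concern concrete, here is a back-of-the-envelope sketch. It assumes each output array is single-precision (4 bytes per pixel), and `estimate_output_bytes` is a hypothetical helper, not a function from this PR:

```python
import math


def estimate_output_bytes(output_shape, n_arrays, bytes_per_pixel=4):
    """Rough size of n_arrays output arrays of the given shape, in bytes."""
    return math.prod(output_shape) * n_arrays * bytes_per_pixel


shape = (10_000, 10_000)  # hypothetical output mosaic size

# The hard-coded value: two arrays (for example, data and weight).
two_arrays = estimate_output_bytes(shape, n_arrays=2)

# A hypothetical count that also includes several variance arrays.
five_arrays = estimate_output_bytes(shape, n_arrays=5)
```

For this shape the two estimates are 0.8 GB and 2.0 GB, so a memory check based on two arrays would accept outputs that need 2.5 times as much memory if five arrays are actually held at once.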

emolter · 2024-08-23T15:37:20Z

src/stcal/resample/resample.py

+    def build_driz_weight(self, model, weight_type=None, good_bits=None):
+        """Create a weight map for use by drizzle
+        """
+        data = self.get_model_array(model, "data")


again here certain attributes of the model are assumed to exist, and IMO this should be made explicit either by having a model base class or by adding stcal dependencies (although we've been trying hard to avoid the latter)

> again here certain attributes of the model are assumed to exist, and IMO this should be made explicit either by having a model base class or ...

Seems like this is not going to happen.

> ... by adding stcal dependencies (although we've been trying hard to avoid the latter)

I don't think this is necessary as this PR illustrates how to avoid imports.

Yes, it is true that we assume some attributes exist. I think for now we are safe, but if you would like to harden the code, I could add try-except blocks around these data retrieval statements.
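
One way to make the assumed attributes explicit without adding a package dependency is structural typing. The sketch below is an assumption for illustration, not something proposed in this PR: `ArrayModel`, `missing_attributes`, and the attribute names `data` and `dq` are hypothetical.

```python
from typing import Any, Protocol, runtime_checkable


@runtime_checkable
class ArrayModel(Protocol):
    """Hypothetical minimal interface a model must satisfy."""

    data: Any
    dq: Any


def missing_attributes(model, names=("data", "dq")):
    """List the required attribute names that the model lacks."""
    return [name for name in names if not hasattr(model, name)]


class GoodModel:
    """Toy model that provides both assumed attributes."""

    def __init__(self):
        self.data = [[1.0, 2.0]]
        self.dq = [[0, 4]]


class BadModel:
    """Toy model that is missing the dq attribute."""

    def __init__(self):
        self.data = [[1.0, 2.0]]
```

An `isinstance(model, ArrayModel)` check at the entry point would then fail early with a clear message, instead of raising an `AttributeError` deep inside the resampling code.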

emolter · 2024-08-23T15:40:02Z

src/stcal/resample/resample.py

+
+    @abc.abstractmethod
+    def run(self):
+        ...


by marking it as an abstractmethod this is done automatically if you try to call it directly

emolter · 2024-08-23T15:45:32Z

src/stcal/resample/resample.py

+
+    Notes
+    -----
+    This routine performs the following operations::


this looks identical to ResampleCoAdd. Some doc updates should be made to clarify how they are different

Move common resample code to stcal

07c0757

mcara requested a review from a team as a code owner August 8, 2024 13:37

mcara marked this pull request as draft August 8, 2024 13:37

mcara self-assigned this Aug 8, 2024

mcara mentioned this pull request Aug 8, 2024

Move common resample code to stcal spacetelescope/jwst#8695

fix method names

d3439fc

kmacdonald-stsci approved these changes Aug 8, 2024

mcara added 3 commits August 14, 2024 10:02

Remove dependencies on stdatamodels, etc.

f627731

do not access model attributes directly

3360c7b

fix use of model attributes

6612d3a

braingram reviewed Aug 23, 2024

emolter reviewed Aug 23, 2024

mcara added 3 commits August 26, 2024 02:01

Address reviewer comments

a0e07a5

Refactor previous code to work with arrays only

5da615d

refactor

a6ffe26
