Skip to content

Add NRC imaging notebook #19

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 18 commits into from
Nov 18, 2024

Conversation

bhilbert4
Copy link
Collaborator

@bhilbert4 bhilbert4 commented Sep 5, 2024

This notebook checklist has been made available to us by the the Notebooks For All team.
Its purpose is to serve as a guide for both the notebook author and the technical reviewer highlighting critical aspects to consider when striving to develop an accessible and effective notebook.

The First Cell

  • The title of the notebook in a first-level heading (eg. <h1> or # in markdown).
  • A brief description of the notebook.
  • A table of contents in an ordered list (1., 2., etc. in Markdown).
  • The author(s) and affiliation(s) (if relevant).
  • The date first published.
  • The date last edited (if relevant).
  • A link to the notebook's source(s) (if relevant).

The Rest of the Cells

  • There is only one H1 (# in Markdown) used in the notebook.
  • The notebook uses other heading tags in order (meaning it does not skip numbers).

Text

  • All link text is descriptive. It tells users where they will be taken if they open the link.
  • All acronyms are defined at least the first time they are used.
  • Field-specific/specialized terms are used when needed, but not excessively.

Code

  • Code sections are introduced and explained before they appear in the notebook. This can be fulfilled with a heading in a prior Markdown cell, a sentence preceding it, or a code comment in the code section.
  • Code has explanatory comments (if relevant). This is most important for long sections of code.
  • If the author has control over the syntax highlighting theme in the notebook, that theme has enough color contrast to be legible.
  • Code and code explanations focus on one task at a time. Unless comparison is the point of the notebook, only one method for completing the task is described at a time.

Images

  • All images (jpg, png, svgs) have an image description. This could be

    • Alt text (an alt property)
    • Empty alt text for decorative images/images meant to be skipped (an alt attribute with no value)
    • Captions
    • If no other options will work, the image is decribed in surrounding paragraphs.
  • Any text present in images exists in a text form outside of the image (this can be alt text, captions, or surrounding text.)

Visualizations

  • All visualizations have an image description. Review the previous section, Images, for more information on how to add it.

  • Visualization descriptions include

    • The type of visualization (like bar chart, scatter plot, etc.)
    • Title
    • Axis labels and range
    • Key or legend
    • An explanation of the visualization's significance to the notebook (like the trend, an outlier in the data, what the author learned from it, etc.)
  • All visualizations and their parts have enough color contrast (color contrast checker) to be legible. Remember that transparent colors have lower contrast than their opaque versions.

  • All visualizations convey information with more visual cues than color coding. Use text labels, patterns, or icons alongside color to achieve this.

  • All visualizations have an additional way for notebook readers to access the information. Linking to the original data, including a table of the data in the same notebook, or sonifying the plot are all options.

@bhilbert4 bhilbert4 self-assigned this Sep 5, 2024
Copy link

Check out this pull request on  ReviewNB

See visual diffs & provide feedback on Jupyter Notebooks.


Powered by ReviewNB

@bhilbert4
Copy link
Collaborator Author

@drlaw1558 @sosey here is the NIRCam imaging mode notebook. It's essentially a copy of the NIRISS notebook from #18. Currently it downloads and processes one SW filter and one LW filter for a single program. One question I have though, is that I would like to address a question we get semi-regularly. That is: How do I resample data from multiple filters onto the same pixel grid. I could do that in a separate notebook in some other repo, or I could potentially add another filter to this notebook and do it here. Or swap out the LW data with a second SW filter?

Currently, with one SW filter and one LW filter, the notebook produces ~8GB of data, and takes ~20 minutes to run (on my laptop). Total data volume of reference files is about 71 GB.

@drlaw1558
Copy link
Collaborator

@bhilbert4 Thanks! Also tagging @rizeladiaz
71 GB of reference data is more than I'd expected- is that mostly darks from the different detectors? Could the LW data be resampled onto the same pixel grid as the SW data?

@bhilbert4
Copy link
Collaborator Author

@drlaw1558 Looks like I had some extra rate files sitting in the data directory from before I cut down the number of detectors. I was also double counting in a couple cases. The notebook now uses only 3 detectors. That translates into about 18GB of dark current data. All the other reference files only add another 1 GB or so.

The LW data could be resampled onto the same pixel grid as the SW, or vice versa, but that does involve some non-default settings when calling image3. I'm not sure non-default you want to get in these notebooks. Maybe I could run the SW and LW through image3 with default settings, and then call image3 a second time on the LW data in order to put it on the SW grid?

@drlaw1558
Copy link
Collaborator

@bhilbert4 Great, that sounds like a much more tractable data volume in family with the other modes.

Another way of looking at the issue: which approach will make it easier for a user to run the notebook on data from some arbitrary other program? Is the resampling onto a common grid something that could be achieved with just a few commented-out options in the image3 parameters block? If they only have one SW and one LW filter of data will the notebook fail? Is the method for getting two SW filter outputs to match the same as getting a LW and a SW filter to match?

@bhilbert4
Copy link
Collaborator Author

I'm not sure it's really possible to have this notebook act as a push-button way for all users to process all data from an arbitrary program. Some programs have a single SW/LW filter pair, while others have multiple pairs. So users will always have to think a little bit when trying to use this notebook with their data. Or at the very least, they may need a separate copy of the notebook for each filter pair they have.

Resampling two datasets onto the same pixel grid is the same process whether you're talking about two SW filters, or a LW and a SW. Outside of setting parameter values in the resample step, there is one line of extra code needed. That's to save the gWCS from a first mosaic into an asdf file. This asdf file is then fed to the resampling of the second filter's data via the output_wcs parameter. So I could set up this notebook to run the LW data and save the asdf file. Then when calling image3 on the SW data I could add a note saying that if the user wants to resample onto the same grid, they just need to uncomment the output_wcs parameter.

I'd probably add some text at that point mentioning that if the user has multiple SW filters they want on the same grid, they can use the same process. I think that makes more sense than changing the notebook to work on generic "filter 1" and "filter 2" data rather than SW and LW, even if the former case would cover more arbitrary use cases.

@drlaw1558
Copy link
Collaborator

Sounds good to me.

@bhilbert4
Copy link
Collaborator Author

@drlaw1558 @rizeladiaz I think this is ready for review.

Copy link
Collaborator

@drlaw1558 drlaw1558 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • In the CRDS setup cell, we should not set an explicit CRDS version; this should be commented out (i.e., an option for users to explore). Correct CRDS version handling to match the pipeline software use will be done on the CRDS backend now, and doing this in the notebook makes it harder to run the notebook with more-recent pipeline versions.
  • Likewise, remove the line printing the context from this cell since the keyword may be undefined. The CRDS version used will be printed in the log for each pipeline step.

@bhilbert4
Copy link
Collaborator Author

@drlaw1558 I'm unable to merge because of the failing tests (which are failing because of the memory issue). Do you have permissions to merge a PR where all the tests are not passing?

@drlaw1558
Copy link
Collaborator

Hm, doesn't look like I can either. Passing this over to @sosey

@drlaw1558
Copy link
Collaborator

drlaw1558 commented Nov 15, 2024

@bhilbert4 A few additional things that I just encountered:

  • A number of the print statements in the Demo Setup section aren't wrapped inside the 'if demo_mode' statements, meaning that they cause crashes if running outside of demo mode.
  • Looks like the notebook doesn't know how to handle the case of only SW or LW data provided, rather than both being present. Would there be an easy way to catch this and only process whichever inputs were provided?
  • Lvl3 association creation also crashes since the 'program' keyword is undefined outside of demo mode; this could be accessed programmatically from the file headers.

@bhilbert4
Copy link
Collaborator Author

@drlaw1558 I made updates so that the print statements in the demo setup are only called if demo_mode is true. I also separated the sw and lw statements such that when the user provides data, they can provide only sw, only lw, or both, and the notebook completes without errors.

@drlaw1558
Copy link
Collaborator

Looks good, thanks @bhilbert4 !

@mgough-970 mgough-970 merged commit 1512aec into spacetelescope:main Nov 18, 2024
2 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants