Skip to content

Initial commit of MIRI MRS notebook #3

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 28 commits into from
Jul 10, 2024
Merged

Initial commit of MIRI MRS notebook #3

merged 28 commits into from
Jul 10, 2024

Conversation

drlaw1558
Copy link
Collaborator

This PR is for the initial commit of the MIRI MRS pipeline notebook.

Copy link

Check out this pull request on  ReviewNB

See visual diffs & provide feedback on Jupyter Notebooks.


Powered by ReviewNB

Copy link

review-notebook-app bot commented Apr 19, 2024

View / edit / reply to this conversation on ReviewNB

sosey commented on 2024-04-19T21:52:37Z
----------------------------------------------------------------

Do we want to provide a common format for the header information block here that more closely matches what is over in JDAT? see https://github.com/spacetelescope/jdat_notebooks/blob/main/notebooks/MIRI_LRS_spectral_extraction/miri_lrs_advanced_extraction_part1.ipynb as an example. I like the addition of the link to this specific notebook, if this notebook has a name change or deprecation, the users old link will break, and we'll want to think if there is a way to help them get to the right one.


drlaw1558 commented on 2024-04-24T21:57:44Z
----------------------------------------------------------------

I took at look at the JDAT format; while I don't want to follow that precisely with having a couple pages of text before the notebook gets started I have made some boldface bullets before each text block to call out more clearly what each is for. I've changed to just point toward the notebook repository instead of the notebook file itself- it should be fairly easy to find from the top level and would be more robust against future changes. Main aim with this is in the case that the community starts modifying and making their own versions (which we want) to have a clear place where any future users of said notebook could find any updates to the original.

Copy link

review-notebook-app bot commented Apr 19, 2024

View / edit / reply to this conversation on ReviewNB

sosey commented on 2024-04-19T21:52:38Z
----------------------------------------------------------------

This configuration section needs to be updated so that astroquery is used to revtrieve the files used in the notebook by default. Variables within the notebook can allow for the users setting custom locations for saving their data or using local data instead of making the astroquery call. Similar for the CRDS cache, use and env HOME reference


drlaw1558 commented on 2024-04-24T22:12:08Z
----------------------------------------------------------------

I've revised this to use a 'demo_mode' argument which is TRUE by default. In this case it overrides all of the custom user-local paths, and downloads data from Box to work on into a local directory adjacent to the notebook. This should do what we need for CI. Box file is currently staged in my personal Box folder- we should figure out where to move that to longer term.

I've commented out CRDS path information too- that should be set already as part of any pipeline install, so it's only in the notebook for convenience if someone wanted to change it around.

I've made a change to paths as well to use a default home dir instead of /Users/dlaw/ . I'm not super keen on that since it makes the line a lot clunkier and harder to read though, especially since by definition that's the line that folks need to modify in order to run things on their own data. Maybe there's a cleaner syntax.

sosey commented on 2024-05-06T18:08:52Z
----------------------------------------------------------------

There is already an STScI solution for using Box that does not involve personal box folders that we can use. However, I am more worried that we should not use box specifically for these notebook examples. As they are directly tied to pipeline processing, and the input data may change in form and content following build deliveries in ways that would require us pulling the data from the archive again and restaging on box. I feel it would be better to pull the data from the archive directly instead. I'll suggest some alternatives.

sosey commented on 2024-05-10T20:42:54Z
----------------------------------------------------------------

I made updates that pull all the data from the archive, first by query and then download. See the notebook here https://github.com/spacetelescope/jwst-pipeline-notebooks/blob/demo-example/notebooks/MIRI/JWPipeNB-MIRI-MRS.ipynb

Copy link

review-notebook-app bot commented Apr 19, 2024

View / edit / reply to this conversation on ReviewNB

sosey commented on 2024-04-19T21:52:39Z
----------------------------------------------------------------

Line #7.    # Set CRDS context (if overriding; otherwise will use latest context by default)

Do we want to include more information on why or when to override the context?


drlaw1558 commented on 2024-04-24T22:12:19Z
----------------------------------------------------------------

Added some words.

Copy link

review-notebook-app bot commented Apr 19, 2024

View / edit / reply to this conversation on ReviewNB

sosey commented on 2024-04-19T21:52:40Z
----------------------------------------------------------------

Line #21.    input_dir = '/Users/dlaw/FlightData/APT1523/data/Obs03/uncal/'

all these types of lines need to be more generic


drlaw1558 commented on 2024-04-24T22:13:11Z
----------------------------------------------------------------

As above; made more generic, but at the cost of readability. Not sure that's a good trade right now given this line is only relevant when users start modifying paths anyway, but let's see.

Copy link

review-notebook-app bot commented Apr 19, 2024

View / edit / reply to this conversation on ReviewNB

sosey commented on 2024-04-19T21:52:40Z
----------------------------------------------------------------

For notebook creators, we should try and limit the number of files being pulled and then processed, so that we have enough to inform the example but not so much to overwhelm the testing.


drlaw1558 commented on 2024-04-24T22:14:00Z
----------------------------------------------------------------

Agreed in principle, though we also want the data set to be realistic. In this case it's a minimal complete set of typical MRS observations.

sosey commented on 2024-05-10T20:44:43Z
----------------------------------------------------------------

I added a download and query to the example notebook on the demo-example branch, we can run this through CI as part of aceptance to make sure that github actions can handle the data volume and download time

Copy link

review-notebook-app bot commented Apr 19, 2024

View / edit / reply to this conversation on ReviewNB

sosey commented on 2024-04-19T21:52:41Z
----------------------------------------------------------------

worth also showing how to save the plot to file?


drlaw1558 commented on 2024-04-24T22:14:28Z
----------------------------------------------------------------

Done; some worry that it could clobber a users other local plot, but likely fairly low risk.

Copy link

review-notebook-app bot commented Apr 19, 2024

View / edit / reply to this conversation on ReviewNB

sosey commented on 2024-04-19T21:52:42Z
----------------------------------------------------------------

Line #17.        x1d=fits.open(file)

either close each file or use a context manager


drlaw1558 commented on 2024-04-24T22:15:51Z
----------------------------------------------------------------

Done, that got rid of a good number of warnings.

Copy link
Member

@sosey sosey left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

see the review comments, there are still some changes to make so that it can run for the general pub and ci

Copy link
Collaborator Author

I took at look at the JDAT format; while I don't want to follow that precisely with having a couple pages of text before the notebook gets started I have made some boldface bullets before each text block to call out more clearly what each is for. I've changed to just point toward the notebook repository instead of the notebook file itself- it should be fairly easy to find from the top level and would be more robust against future changes. Main aim with this is in the case that the community starts modifying and making their own versions (which we want) to have a clear place where any future users of said notebook could find any updates to the original.


View entire conversation on ReviewNB

Copy link
Collaborator Author

I've revised this to use a 'demo_mode' argument which is TRUE by default. In this case it overrides all of the custom user-local paths, and downloads data from Box to work on into a local directory adjacent to the notebook. This should do what we need for CI. Box file is currently staged in my personal Box folder- we should figure out where to move that to longer term.

I've commented out CRDS path information too- that should be set already as part of any pipeline install, so it's only in the notebook for convenience if someone wanted to change it around.

I've made a change to paths as well to use a default home dir instead of /Users/dlaw/ . I'm not super keen on that since it makes the line a lot clunkier and harder to read though, especially since by definition that's the line that folks need to modify in order to run things on their own data. Maybe there's a cleaner syntax.


View entire conversation on ReviewNB

Copy link
Collaborator Author

Added some words.


View entire conversation on ReviewNB

Copy link
Collaborator Author

As above; made more generic, but at the cost of readability. Not sure that's a good trade right now given this line is only relevant when users start modifying paths anyway, but let's see.


View entire conversation on ReviewNB

@drlaw1558
Copy link
Collaborator Author

I think I've got the dependencies all adjusted and I'm no longer getting immediate execution crashes, but it sure seems to be taking a long time in the 'Validate notebooks' step. What is this step actually doing, as it's prior to notebook execution?

@drlaw1558
Copy link
Collaborator Author

Looks like the latest error is that CI is running out of disk space with just 71 MB. Not sure what it started at? Where is the actual CI server based, is it shared between different projects, etc? We either need to clean up the machine and/or substantially increase the available space- even a minimum typical JWST data set is going to run to a couple of GB.

Copy link
Member

@sosey sosey left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm assuming that the changes have been made with the follow-on reviews

Copy link
Collaborator

@rizeladiaz rizeladiaz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree with changes

Copy link
Member

@sosey sosey left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Blocking this from merge until further discussions happen.

Copy link
Member

@sosey sosey left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

approving for merge after discussion with the about the benefits of having an example notebooks in the main branch for ins to start building off of. Testing to understand larger runners will continue in parallel

@rizeladiaz rizeladiaz merged commit 61f0663 into main Jul 10, 2024
2 checks passed
@rizeladiaz
Copy link
Collaborator

Merging pull request to make notebook available to Instrument Branches

@rizeladiaz rizeladiaz deleted the mirimrs_apr12 branch July 10, 2024 22:54
mgough-970 pushed a commit that referenced this pull request Nov 14, 2024
Updated notebook plot and CRDS
mgough-970 pushed a commit that referenced this pull request Nov 18, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants