Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merge for Lowering the Barriers Paper #41

Open
wants to merge 57 commits into
base: next
Choose a base branch
from

Conversation

ChristinaB
Copy link

@ChristinaB ChristinaB commented Oct 29, 2018

These folders contain tutorials for the Lowering the Barriers paper (Enabling Collaborative Numerical Modeling in Earth Sciences using Knowledge Infrastructure). Copies of these versions are also available in the HydroShare resource: Bandaragoda, C., A. M. Castronova, J. Phuong, E. Istanbulluoglu, S. S. Nudurupati, R. Strauch, N. Gasparini, E. Hutton, G. Tucker, D. Hobley, K. Barnhart, J. Adams (2018). Knowledge Infrastructure for Enabling Collaborative Numerical Modeling in Earth Sciences: Landlab Notebooks, HydroShare, http://www.hydroshare.org/resource/70b977e22af544f8a7e5a803935c329c

kbarnhart and others added 30 commits March 7, 2018 18:34
Uploaded latest files as of today after consultation with Christina.
merging from 9/12 updates
Fixed a few typos and removed reference to saving arrays as variables.
Update Replicate_Landslide_model_for_fire.ipynb
modified landslide tutorial for fisher creek
overland flow tut: code style, legends in figures, some text clarified
@ChristinaB
Copy link
Author

@mcflugen Is this ready to merge? I want to try using Binder. Thanks!

@kbarnhart
Copy link
Collaborator

@ChristinaB a couple of comments on this PR.

  1. There are a bunch of files in this PR that are not jupyter notebooks and things required to run the jupyter notebooks (e.g. the three PDFs inside the cyberinfrastructure_intro folder). I'd recommend you remove these files from this PR.

  2. There are a bunch of notebooks that have nothing to do with the lowering the barriers paper (e.g. the network model grid notebooks that are currently in development). I'd recommend you remove these files from this PR.

  3. Within the tutorials folder, it would be nice to have 1 folder called "lowering_the_barriers" in which you have folders for each of your example notebooks. This would make it easy for a user to find the notebooks associated with this PR.

  4. The tests don't pass. This needs to be addressed before this PR can be merged.

@mcflugen
Copy link
Member

mcflugen commented Oct 31, 2018

@ChristinaB

Some thoughts before merging.

  • Tests need to pass
  • Remove cyberinfrastructure_intro folder (it only contains PDFs, to notebook tutorials)
  • Remove all files generated by the tutorial notebooks (like the json files?)
  • Possibly remove the ogh_newmexico folder as isn't really a landlab tutorial (?)
  • Remove Python scripts that were saved from notebooks (e.g. explore_routing_tutorial.py)
  • Remove hard-coded paths (e.g. /home/jovyan/work/notebooks/data)
  • The NetworkModelGrid notebooks should probably have some description with them. Do they even belong in the pull request?
  • observatory_gridmet_newmexico doesn't appear to be a landlab tutorial. Remove?

I wonder if some of these tutorials will not be able to run on Travis because of their dependence on HydroShare. If so, perhaps those tutorials shouldn't be in this repository.

I reformatted some of the notebooks but I'm unable to run them as I don't have some of the dependencies. All of the dependencies should be conda-installable and listed in requirements.txt. I think the main culprits are utilities and ogh.

@kbarnhart Perhaps each folder should have its own requirements file? Each notebook could then be run in its own environment built from that file?

@kbarnhart
Copy link
Collaborator

I can confirm that the network model grid notebooks definitely do not belong in this pull request.

@mcflugen, I think your notebook-level requirements.txt idea is a good. If I recall the page you recommended I look at when I worked on fixing the notebook testing had a way to use these notebook-level requirements.txt files if they existed, or just default to one at an upper level of the repository if necessary.

One additional requirement will be geopandas (which is what the past tests were failing on).

A final concern is related to notebook run time.

@ChristinaB , do you know how long it typically takes for each of these notebooks to run?

@mcflugen we may want to look into shortening some of the run times of the longer-running notebooks in this repo. I know that some of them are pretty long (e.g. ~10 m). This way we can make sure that we have plenty of travis-testing-time to add more notebooks in this PR and in the future.

@mcflugen
Copy link
Member

@kbarnhart Yes, we definitely want to make sure the notebooks run quickly. I would say, less than a a minute or two - max.

I'll look more into notebook-level requirements. I think that would be useful.

@mcflugen
Copy link
Member

@kbarnhart, @ChristinaB I just removed the NetworkModelGrid tutorial.

@kbarnhart
Copy link
Collaborator

@mcflugen this is the blog post I was thinking of. It describes custom environments.

@saisiddu
Copy link

Just to add to this conversation, the 'reuse_ecohydrology_gridhydromet.ipynb' will take at least 45 mins to run. It also uses files produced by 'ogh_mexico'.

@kbarnhart
Copy link
Collaborator

Sounds like the folder ogh_mexico should live inside the folder associated with the reuse_ecohydrology_gridhydromet.ipynb notebook so the dependency hierarchy is clear.

@ChristinaB
Copy link
Author

We are way over the 1-2 minute requirement on all of the tutorials. Is this necessary? or just preferred (I would prefer it...)

@kbarnhart
Copy link
Collaborator

With a 45 minute run time, it is unlikely that a notebook can be tested (as the 50 minute window applies to all of the software install and test running). Is there any way to make the runtime shorter?

@saisiddu
Copy link

@kbarnhart Yes, the runtime can be drastically reduced by changing 'n'. Is there a way to let the code know that it's testing time so that I can insert a flag that results in a shorter 'n' for testing?

@mcflugen
Copy link
Member

mcflugen commented Oct 31, 2018

@kbarnhart It does but it looks like that tutorial is in two places.

  1. observatory_gridmet/observatory_gridmet_newmexico.ipynb
  2. ecohydrology_gridhydromet/ogh_newmexico/observatory_gridmet_newmexico.ipynb

That should definitely be cleaned up.

@saisiddu Could the reuse_ecohydrology_gridhydromet tutorial be rewritten to not be dependent on on the observatory_gridmet_newmexico notebook? Maybe just include the necessary files that result from it?

@mcflugen
Copy link
Member

The run time really does have to be fairly short. We have to test all of the notebooks on both Mac and Linux, for Python 2.7, and 3.6 (and soon 3.7). As @kbarnhart says, this all has to be done in a little less than an hour.

@saisiddu
Copy link

saisiddu commented Oct 31, 2018

@mcflugen: reuse_ecohydrology_gridhydromet notebook is currently independent. It only uses the outputs (previously created by observatory_gridmet_newmexico notebook).

@mcflugen
Copy link
Member

@saisiddu

Is there a way to let the code know that it's testing time so that I can insert a flag that results in a shorter 'n' for testing?

Not really. I would suggest changing the default to be something small so that the notebook runs quickly and then having a user change n to a larger number by hand if they want to do a larger run.

@mcflugen
Copy link
Member

@ChristinaB If you want to try out your notebooks in binder, I think you should just be able to point binder to your fork of the tutorials. I don't think it requires you to merge the notebooks.

@saisiddu
Copy link

@mcflugen: Yep, that was what I was thinking. I will make the default a small number. Also, for the `observatory_gridmet_newmexico' notebook, I will comment out (or convert the cells to 'raw'). Therefore, if someone is motivated, they can read the instructions to un-comment them and run them.

@ChristinaB
Copy link
Author

My current suggestion is that we keep the full build out tutorials in HydroShare for launching in classroom use. We can keep a test version in landlab. Next week I will have time tol strip down the classroom version, address the above comments, and then do another PR. @mcflugen can you help get over the OGH hurdle? This may take another conversation with @jphuong Thanks @kbarnhart and @saisiddu I love Landlab!

@mcflugen
Copy link
Member

@ChristinaB No need for another pull request. Just continue to push commits to this one.

I thought that ogh was on conda-forge but, if it is, it must have another name. @jphuong?

It looks like it's on PyPI though. We may just have to install it with pip rather than conda. However, it would be better (for us, anyway) to have it on conda-forge. Any reason not to have it there as well? if not, I would be happy to submit it.

@jphuong
Copy link

jphuong commented Oct 31, 2018

@ChristinaB @mcflugen ogh is in conda-forge, but it is only available for Python 3.6 (https://github.com/conda-forge/ogh-feedstock/blob/master/recipe/meta.yaml). If your pip or anaconda were installed for Python 2, you may need to explicitly mention conda install ogh for a python3.6 kernel.

One more thing, ogh v.01.11 recently encountered some issues because Matplotlib migrated to v3.0, which has also dropped version compatibility and support for Python 2.x. There are still a few other bugs that need to be fixed, but ogh v.0.1.11 needs to versioned with Matplotlib v.2.2.3 in a Python 3.6+ environment.

@mcflugen
Copy link
Member

Thanks @jphuong. I don't know how I missed that.

Until we drop landlab support for Python 2, we require the tutorials to work with both 2.7 and 3.6+.

I guess we could allow some notebooks to only run with Python 3. My inclination, though, is to require them to work with both 2 and 3. @kbarnhart what do you think?

@jphuong
Copy link

jphuong commented Nov 1, 2018

@mcflugen Just to save you some time, this will be the pip installation protocol from command line.
OGH actually will work in Python2 environments, but a few months ago we had only configured conda install setup just for Python3.6.

Python2
python -m pip install ogh sympy pygraphviz libgdal gdal ncurses matplotlib==2.2.3 -U

Python3
python3.6 -m pip install ogh sympy pygraphviz libgdal gdal ncurses matplotlib==2.2.3 -U

@mcflugen
Copy link
Member

mcflugen commented Nov 1, 2018

@jphuong I just had a closer look and you've built OGH as noarch on conda-forge so it should install just fine with Python 2. I think I must have just made a typo when I was searching for it before (yes, just three letters and I still made a typo 🤦‍♂️ ).

@kbarnhart kbarnhart changed the base branch from release to next October 14, 2019 19:41
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants