Allow scripts to be run instead of notebooks #93

TeaganKing · 2024-04-10T16:47:14Z

We'd like to be able to run a script rather than just a notebook with CUPiD.

A good test of this would be to run a sea ice script instead of a sea ice notebook.

This will require saving figures generated from the notebook under a standard naming convention and reading them into an HTML file that can display them. See #16 for the HTML component of this.
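To make the figure-to-HTML idea concrete, here is a minimal sketch, not CUPiD's actual implementation. It assumes figures are saved as `<component>_<figure>.png` in a single directory; that naming pattern, the function name `build_figure_page`, and the page layout are all hypothetical choices made for illustration.

```python
from html import escape
from pathlib import Path


def build_figure_page(fig_dir, component, out_name="index.html"):
    """Write an HTML page embedding every '<component>_*.png' in fig_dir.

    The '<component>_<figure>.png' naming pattern is a hypothetical
    convention used only for this sketch.
    """
    fig_dir = Path(fig_dir)
    # Sort so the page order is stable between runs.
    figures = sorted(fig_dir.glob(f"{component}_*.png"))

    lines = ["<html>", "<body>", f"<h1>{escape(component)}</h1>"]
    for fig in figures:
        # Reference figures by file name so the page works next to them.
        lines.append(f'<img src="{escape(fig.name)}" alt="{escape(fig.stem)}">')
    lines += ["</body>", "</html>"]

    out_path = fig_dir / out_name
    out_path.write_text("\n".join(lines), encoding="utf-8")
    return out_path
```

A script would save its figures using the agreed pattern and then call something like `build_figure_page("figures", "seaice")`. Because the page is built purely from file names, it would not matter whether a notebook or a script produced the images.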

mnlevy1981 · 2024-04-18T17:20:03Z

@dabail10 started to work on this and ran into a major stumbling block: the Ploomber script runner doesn't offer an obvious way to change environments before running the script. Dave's example was being run in cupid-dev, which doesn't have the necessary analysis packages installed. Lev will dig into the Ploomber documentation and then I'll work with Dave to get a fix in place.

rmshkv · 2024-04-18T17:35:35Z

As a first pass, having looked through the Ploomber documentation, it doesn't look like there's a supported way to specify an environment for the ScriptRunner task. I'd recommend starting with looking at the documentation for ScriptRunner. If we wanted to go the route of putting in a PR to Ploomber or implementing our own ScriptRunner class, it actually looks like it might be a pretty straightforward update to their code, but that might be out of scope, in which case we should put together a more hacky solution in our own code.

TeaganKing · 2024-05-14T22:48:23Z

Just for reference, the function which includes ScriptRunner is create_ploomber_script_task.

In the ScriptRunner documentation, it looks like we should be able to specify the environment in the configuration file using something like this, adapted for the script that you have (and I don't entirely remember the structure that you had in the config file thus far):

tasks:
  - source: script.py
    product: output.txt
    env: cupid-analysis

So, specifying 'env' in an additional line should ideally work to allow for using the cupid-analysis environment. Maybe we can test it out tomorrow?

rmshkv · 2024-05-14T22:54:15Z

Hmm, maybe I missed something - could you point me to where in the documentation or Ploomber source code that feature is mentioned?

TeaganKing · 2024-05-14T23:08:38Z

Apologies, I may have jumped ahead without enough research on this... I saw this in an example, but after another look, I'm now thinking that the example was not accurate and it might not actually work in the format that I commented in the code snippet above. @rmshkv did you already have a particular place in the Ploomber code you were looking at to update?

TeaganKing · 2024-05-15T16:45:56Z

It looks like the kernelspec_name in NotebookRunner is set roughly here. Perhaps we could add an environment to this part of ScriptRunner?

Or, if we prefer to do something from our own side, could we possibly try something like this from within the script? I'm not sure what the best method is, but wanted to toss out some ideas.

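import subprocess
# Caveat: this activates the environment only inside the child shell, not in the calling Python process.
subprocess.run("conda activate cupid-analysis", shell=True, check=True, executable='/bin/bash')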
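One caveat with activating from inside the script: `conda activate` works by changing the environment of the shell it runs in, and a shell started by `subprocess.run` is a child process whose environment changes are discarded when it exits. The minimal sketch below illustrates that general behaviour with a plain environment variable instead of conda, so it runs anywhere with a POSIX shell. The variable name `CUPID_DEMO_ENV` is made up for the demonstration.

```python
import os
import subprocess

# Hypothetical variable name, used only for this demonstration.
VAR = "CUPID_DEMO_ENV"
os.environ.pop(VAR, None)

# The child shell sets the variable and can see it...
child = subprocess.run(
    f"export {VAR}=cupid-analysis && echo ${VAR}",
    shell=True,
    check=True,
    capture_output=True,
    text=True,
)
child_value = child.stdout.strip()

# ...but the change vanishes with the child; this process never sees it.
parent_value = os.environ.get(VAR)
```

Here `child_value` holds the value set inside the shell while `parent_value` stays `None`, which suggests the rest of the script would keep running with the packages of the original environment.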

For reference, I'm including the NotebookRunner documentation and ScriptRunner documentation.

TeaganKing · 2024-05-15T19:14:27Z

From @mnlevy1981: we could also possibly use something like os.system("conda run -n cupid-analysis python script.py") in run.py to replace line 254 (the call to create_ploomber_script_task).
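A minimal sketch of the `conda run` idea, using `subprocess.run` rather than `os.system` so that a failing script raises an exception instead of returning a status code that is easy to ignore. The helper names `conda_run_command` and `run_script_in_env` are hypothetical and are not part of CUPiD or Ploomber.

```python
import subprocess


def conda_run_command(env_name, script_path, script_args=()):
    """Build the argument list for running a script inside a conda env."""
    # 'conda run -n <env> <command...>' executes the command in that env.
    return ["conda", "run", "-n", env_name, "python", str(script_path), *script_args]


def run_script_in_env(env_name, script_path, script_args=()):
    """Run the script and raise CalledProcessError on a non-zero exit."""
    cmd = conda_run_command(env_name, script_path, script_args)
    # A list plus the default shell=False avoids shell quoting problems.
    return subprocess.run(cmd, check=True)
```

For example, `run_script_in_env("cupid-analysis", "script.py")` would be roughly equivalent to the `os.system` call above, assuming `conda` is on the `PATH` of the process running CUPiD.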

rmshkv · 2024-05-15T19:22:06Z

Some rough thoughts: I haven't tried any of these particular solutions, but in general, I think the first attempt should be along the lines of modifying Ploomber's code or implementing our own script runner task code if we don't want to contribute to that code base. It'd be nice to keep the interface of specifying the environment in config.yml rather than having to add code to the script itself, and it would also be best to keep the script running mechanism as a Ploomber task, since that task creation step isn't actually where the script code is being run (that happens later when the DAG is built, which takes into account dependencies between tasks if those exist, including between script and notebook tasks).
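To illustrate the two properties described above (the environment lives in the config, and creating a task does not run it), here is a rough sketch in plain Python. It deliberately does not use Ploomber's API. The `scripts` section, the `script` and `env` keys, and the `ScriptTask` class are all hypothetical, and `config` stands in for whatever dictionary would be loaded from config.yml.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScriptTask:
    """A script that has been configured but not yet executed."""

    name: str
    script: str
    env: str

    def command(self):
        # Command a runner could execute later, once dependencies are resolved.
        return ["conda", "run", "-n", self.env, "python", self.script]


def tasks_from_config(config, default_env):
    """Create (but do not run) one ScriptTask per entry under 'scripts'."""
    tasks = []
    for name, settings in config.get("scripts", {}).items():
        settings = settings or {}  # allow an entry with no settings
        tasks.append(
            ScriptTask(
                name=name,
                script=settings.get("script", f"{name}.py"),
                env=settings.get("env", default_env),
            )
        )
    return tasks
```

In a real implementation the equivalent of `ScriptTask` would presumably be a Ploomber task, so that script and notebook tasks could still depend on each other in the same DAG.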

TeaganKing · 2024-08-20T19:56:39Z

I'm working on updates to Ploomber to address this.

TeaganKing · 2024-08-20T21:49:11Z

One other note, however: given a potential new environment that includes the dependencies not only for running the CUPiD commands but also for executing notebooks (and scripts), perhaps this issue would be resolved by a combined environment anyway. That said, being able to specify a non-default kernel is still useful, although perhaps not as important as getting the script functionality implemented in the first place.

TeaganKing added framework common utility labels Apr 10, 2024

mnlevy1981 assigned mnlevy1981 and rmshkv Apr 18, 2024

TeaganKing linked a pull request May 15, 2024 that will close this issue

Testing out external scripts #99

Closed

mnlevy1981 mentioned this issue Jul 30, 2024

Key metrics #118

Merged

7 tasks

TeaganKing mentioned this issue Jul 31, 2024

A few renaming suggestions #106

Closed

TeaganKing unassigned rmshkv Jan 2, 2025
