run_ensemble_ler() crashes on remote distributed cluster #236

Open
tekknosol opened this issue Nov 16, 2021 · 2 comments

Comments

@tekknosol

Hi,
I am working on a remote cluster with multiple nodes created by future::plan(cluster). Running LakeEnsemblR 1.0.5 in this environment crashes when calling the model executables.
I figured out that this is due to the way do.call() is used in the function run_ensemble_ler(). According to the documentation of the future package, it is advised to specify the function as the object itself, not by name, when using do.call(), so that the function can be identified as a global object and made available on the workers.
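To illustrate the point, here is a rough sketch using only base R. It uses all.names() as a stand-in for the static code inspection that future performs to find globals; that substitution is an assumption made for illustration, not the actual machinery. The expressions are only inspected, never evaluated.

```r
# The call as currently written: the function name is built at run time.
by_name <- quote(do.call(paste0(".run_", mod_name), run_model_args))

# The call with the function passed as an object.
by_object <- quote(do.call(.run_GLM, run_model_args))

# List every symbol that appears in each expression.
by_name_symbols <- all.names(by_name)
by_object_symbols <- all.names(by_object)

# ".run_GLM" only shows up as a symbol in the second form; in the first
# form it does not exist until paste0() is evaluated.
print(by_name_symbols)
print(by_object_symbols)
```

In the first form there is no symbol that code inspection could pick up and ship to a remote worker.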

So instead of the current code used in run_ensemble.R...

model_out <- setNames(
  lapply(model, function(mod_name) {
    do.call(paste0(".run_", mod_name), run_model_args)
  }),
  model
)

... this problem can be solved by doing something like:

model_out <- setNames(lapply(model, function(mod_name) {
  # Pass the function object itself to do.call(), not its name
  if (mod_name == "Simstrat") {
    do.call(.run_Simstrat, run_model_args)
  } else if (mod_name == "GLM") {
    do.call(.run_GLM, run_model_args)
  }
}), model)

This if() cascade is probably not the most elegant solution; something like a switch() statement would be better.
Anyway, I am happy to submit a pull request if you provide some thoughts on how you would like to deal with it!
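
A minimal sketch of what the switch() variant could look like. The two .run_*() functions below are hypothetical stand-ins so the example runs on its own (the real ones are internal to LakeEnsemblR and take different arguments), and get_runner() is a helper name made up for this illustration. I have not verified this on the cluster.

```r
# Hypothetical stand-ins for the package-internal model runners.
.run_Simstrat <- function(folder, verbose = FALSE) paste("Simstrat ran in", folder)
.run_GLM <- function(folder, verbose = FALSE) paste("GLM ran in", folder)

# Hypothetical helper: map a model name to the function object itself,
# so that do.call() receives a function rather than a character string.
get_runner <- function(mod_name) {
  switch(mod_name,
    Simstrat = .run_Simstrat,
    GLM = .run_GLM,
    stop("No runner defined for model: ", mod_name)
  )
}

model <- c("Simstrat", "GLM")
run_model_args <- list(folder = ".", verbose = FALSE)

model_out <- setNames(
  lapply(model, function(mod_name) {
    do.call(get_runner(mod_name), run_model_args)
  }),
  model
)
```

Unlike the if() cascade, which silently returns NULL for an unknown model, the default branch of switch() stops with an error. The remaining models would need their own entries.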

@tadhg-moore
Collaborator

Hey, that is a good catch, as we have not yet run LakeEnsemblR in a cluster environment. It looks like a relatively straightforward adjustment, and we would be happy to have a PR with a fix for it. Apologies for the delay in responding.

@tekknosol
Author

Meanwhile, I realized that some additional issues have to be solved to run LER on the cluster (at least in my case). I had to deactivate some parts of the code, and I can only use the csv output mode because there seems to be a problem with the upstream NetCDF library.
But I am successfully using LER on the cluster now...

I'll try to figure it out more systematically and then open the respective issues/PRs.
