crew_controller_slurm() causes error with targets 1.10.1 and crew 1.0.0 #52

Open
7 tasks done
Aariq opened this issue Feb 3, 2025 · 2 comments

Aariq commented Feb 3, 2025

Prework

  • Read and agree to the Contributor Code of Conduct and contributing guidelines.
  • Confirm that your issue is a genuine bug in the crew.cluster package itself and not a user error, known limitation, or issue from another package that crew.cluster depends on.
  • If there is already a relevant issue, whether open or closed, comment on the existing thread instead of posting a new issue.
  • Post a minimal reproducible example like this one so the maintainer can troubleshoot the problems you identify. A reproducible example is:
    • Runnable: post enough R code and data so any onlooker can create the error on their own computer.
    • Minimal: reduce runtime wherever possible and remove complicated details that are irrelevant to the issue at hand.
    • Readable: format your code according to the tidyverse style guide.

Description

Since updating to crew 1.0.0 this morning (yay, btw!), I've noticed errors and warnings suggesting that crew.cluster may need updating. Forgive me if this is already on your radar! The errors occur whether or not I'm on the HPC, and they happen when running tar_make(), before any workers are launched.

Reproducible example

targets::tar_dir({
  targets::tar_script({
    library(targets)
    library(crew)
    library(crew.cluster)
    
    controller_hpc <- crew.cluster::crew_controller_slurm(
      name = "hpc",
      workers = 2,
      options_cluster = crew.cluster::crew_options_slurm(
        script_lines = c(
          "#SBATCH --account kristinariemer",
          "module load R/4.4"
        ),
        log_output = "logs/crew_small_log_%A.out",
        log_error = "logs/crew_small_log_%A.err",
        memory_gigabytes_per_cpu = 5,
        cpus_per_task = 2, # total 10 GB RAM (2 CPUs x 5 GB per CPU)
        time_minutes = 1200, # wall time for each worker
        partition = "standard"
      )
    )
    tar_option_set(
      packages = c("tibble"),
      controller = controller_hpc,
    )
    list(
      tar_target(x, tibble(x = 1:3))
    )
  })
  targets::tar_make()
})
#> name (in crew_client()) was deprecated on 2023-01-14 (version 0.10.2.9002). Alternative: none.
#> workers (in crew_client()) was deprecated on 2023-01-13 (version 0.10.2.9002). Alternative: none.
#> retry_tasks was deprecated on 2023-01-13 (version 0.10.2.9002). Alternative: none.
#> ✖ empty pipeline [1.287 seconds]
#> 
#> ── Debugging ───────────────────────────────────────────────────────────────────
#> 
#> ── How to ──────────────────────────────────────────────────────────────────────
#> 
#> ── Last error message ──────────────────────────────────────────────────────────
#> 
#> ── Last error traceback ────────────────────────────────────────────────────────
#> Error:
#> ! targets::tar_make() error
#>     • tar_errored()
#>     • tar_meta(fields = any_of("error"), complete_only = TRUE)
#>     • tar_workspace()
#>     • tar_workspaces()
#>     • Debug: https://books.ropensci.org/targets/debugging.html
#>     • Help: https://books.ropensci.org/targets/help.html
#>     argument is of length zero
#>     base::tryCatch(base::withCallingHandlers({ NULL base::saveRDS(base::do.c...
#>     tryCatchList(expr, classes, parentenv, handlers)
#>     tryCatchOne(tryCatchList(expr, names[-nh], parentenv, handlers[-nh]), na...
#>     doTryCatch(return(expr), name, parentenv, handler)
#>     tryCatchList(expr, names[-nh], parentenv, handlers[-nh])
#>     tryCatchOne(expr, names, parentenv, handlers[[1L]])
#>     doTryCatch(return(expr), name, parentenv, handler)
#>     base::withCallingHandlers({ NULL base::saveRDS(base::do.call(base::do.ca...
#>     base::saveRDS(base::do.call(base::do.call, base::c(base::readRDS("/var/f...
#>     base::do.call(base::do.call, base::c(base::readRDS("/var/folders/wr/by_l...
#>     (function (what, args, quote = FALSE, envir = parent.frame()) { if (!is....
#>     (function (targets_function, targets_arguments, options, envir = NULL, s...
#>     tryCatch(out <- withCallingHandlers(targets::tar_callr_inner_try(targets...
#>     tryCatchList(expr, classes, parentenv, handlers)
#>     tryCatchOne(expr, names, parentenv, handlers[[1L]])
#>     doTryCatch(return(expr), name, parentenv, handler)
#>     withCallingHandlers(targets::tar_callr_inner_try(targets_function = targ...
#>     targets::tar_callr_inner_try(targets_function = targets_function, target...
#>     do.call(targets_function, targets_arguments)
#>     (function (pipeline, path_store, names_quosure, shortcut, reporter, seco...
#>     crew_init(pipeline = pipeline, meta = meta_init(path_store = path_store)...
#>     self$run_crew()
#>     self$iterate()
#>     self$process_target(queue$dequeue())
#>     .subset2(self, "run_target")(target)
#>     if_any(target_should_run_worker(target), self$run_worker(target), self$r...
#>     self$run_worker(target)

Created on 2025-02-03 with reprex v2.1.1

Diagnostic information

Session info
sessionInfo()
#> R version 4.4.1 (2024-06-14)
#> Platform: x86_64-apple-darwin20
#> Running under: macOS 15.2
#> 
#> Matrix products: default
#> BLAS:   /Library/Frameworks/R.framework/Versions/4.4-x86_64/Resources/lib/libRblas.0.dylib 
#> LAPACK: /Library/Frameworks/R.framework/Versions/4.4-x86_64/Resources/lib/libRlapack.dylib;  LAPACK version 3.12.0
#> 
#> locale:
#> [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
#> 
#> time zone: America/Phoenix
#> tzcode source: internal
#> 
#> attached base packages:
#> [1] stats     graphics  grDevices utils     datasets  methods   base     
#> 
#> loaded via a namespace (and not attached):
#>  [1] vctrs_0.6.5       cli_3.6.3         knitr_1.49        rlang_1.1.5      
#>  [5] xfun_0.50         processx_3.8.5    targets_1.10.1    data.table_1.16.4
#>  [9] glue_1.8.0        backports_1.5.0   htmltools_0.5.8.1 ps_1.8.1         
#> [13] rmarkdown_2.29    tibble_3.2.1      evaluate_1.0.3    base64url_1.4    
#> [17] fastmap_1.2.0     yaml_2.3.10       lifecycle_1.0.4   compiler_4.4.1   
#> [21] codetools_0.2-20  igraph_2.1.4      fs_1.6.5          pkgconfig_2.0.3  
#> [25] rstudioapi_0.17.1 digest_0.6.37     R6_2.5.1          tidyselect_1.2.1 
#> [29] reprex_2.1.1      pillar_1.10.1     callr_3.7.6       magrittr_2.0.3   
#> [33] tools_4.4.1       withr_3.0.2       secretbase_1.0.4

Aariq commented Feb 3, 2025

Oh, and I did confirm that this doesn't happen with just a local controller. I couldn't figure out where the deprecated-argument warnings are coming from, or whether they are related to the error.
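
For comparison, the local-controller check might look roughly like the sketch below. The exact code used for that check is not shown in this thread, so this assumes the SLURM controller was simply swapped for crew::crew_controller_local() with the same number of workers; the name controller_local is illustrative only.

```r
# Sketch only: the same pipeline as the reprex, but with a local crew
# controller instead of the SLURM one (an assumed comparison, not the
# author's actual code).
out <- targets::tar_dir({
  targets::tar_script({
    library(targets)
    library(crew)

    # Assumption: 2 local workers stand in for the 2 SLURM workers.
    controller_local <- crew::crew_controller_local(workers = 2)

    tar_option_set(
      packages = c("tibble"),
      controller = controller_local
    )
    list(
      tar_target(x, tibble(x = 1:3))
    )
  })
  targets::tar_make()
  # Read the target back before the temporary directory is abandoned.
  targets::tar_read(x)
})
```

If this runs cleanly while the SLURM version fails, that points at the cluster plugin rather than at targets or crew itself.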

wlandau (Owner) commented Feb 3, 2025

Yes, migrating to mirai 1.0.0 was tricky, and I'm in the middle of upgrading reverse dependencies. I'll release crew.cluster and crew.aws.batch on Thursday. In the meantime, the development versions of these plugins should work.
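
A minimal sketch of installing the development version in the meantime. It assumes the remotes package is acceptable for the install and that the development version lives in the wlandau/crew.cluster GitHub repository; neither detail is stated in the comment above.

```r
# Assumption: remotes is used for the install; get it first if missing.
if (!requireNamespace("remotes", quietly = TRUE)) {
  install.packages("remotes")
}

# Assumption: the development version is hosted at wlandau/crew.cluster.
remotes::install_github("wlandau/crew.cluster")

# Confirm which versions are now installed before rerunning the pipeline.
utils::packageVersion("crew")
utils::packageVersion("crew.cluster")
```

Restart the R session after installing so that the pipeline picks up the new package version.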
