diff --git a/DESCRIPTION b/DESCRIPTION
index 2cdd4d3..599e74c 100644
--- a/DESCRIPTION
+++ b/DESCRIPTION
@@ -1,6 +1,6 @@
 Package: clustermq
 Title: Evaluate Function Calls on HPC Schedulers (LSF, SGE, SLURM, PBS/Torque)
-Version: 0.9.1
+Version: 0.9.2
 Authors@R: c(
     person('Michael', 'Schubert', email='mschu.dev@gmail.com',
     role = c('aut', 'cre', 'cph'),
diff --git a/NEWS.md b/NEWS.md
index fb6f3a5..ef8fe19 100644
--- a/NEWS.md
+++ b/NEWS.md
@@ -1,8 +1,9 @@
-# git head
+# clustermq 0.9.2
 
 * Fix a bug where SSH proxy would not cache data properly (#320)
 * Fix a bug where `max_calls_worker` was not respected (#322)
 * Local parallelism (`multicore`, `multiprocess`) again uses local IP (#321)
+* Pool `info()` now also returns current worker and number of calls
 
 # clustermq 0.9.1
 
diff --git a/README.md b/README.md
index 01a3b79..8a5c547 100644
--- a/README.md
+++ b/README.md
@@ -146,7 +146,7 @@ Use [`batchtools`](https://github.com/mllg/batchtools) if you:
 * don't mind there's no load-balancing at run-time
 
 Use [Snakemake](https://snakemake.readthedocs.io/en/latest/) or
-[`drake`](https://github.com/ropensci/drake) if:
+[`targets`](https://github.com/ropensci/targets) if:
 
 * you want to design and run a workflow on HPC
 
@@ -154,6 +154,15 @@ Don't use [`batch`](https://cran.r-project.org/web/packages/batch/index.html)
 (last updated 2013) or [`BatchJobs`](https://github.com/tudo-r/BatchJobs)
 (issues with SQLite on network-mounted storage).
 
+Questions
+---------
+
+You are welcome to ask questions if something is not clear in the [User
+guide](https://mschubert.github.io/clustermq/articles/userguide.html).
+
+Please use the [Github
+Discussions](https://github.com/mschubert/clustermq/discussions) for this.
+
 Contributing
 ------------
 
@@ -162,8 +171,6 @@ to coordinate development of `clustermq`. Contributions are welcome and they
 come in many different forms, shapes, and sizes. These include, but are not
 limited to:
 
-* Questions: You are welcome to ask questions if something is not clear in the
-  [User guide](https://mschubert.github.io/clustermq/articles/userguide.html).
 * Bug reports: Let us know if something does not work as expected. Be sure to
   include a self-contained [Minimal Reproducible
   Example](https://stackoverflow.com/help/minimal-reproducible-example) and set
diff --git a/vignettes/userguide.Rmd b/vignettes/userguide.Rmd
index eedacf4..dc37e3a 100644
--- a/vignettes/userguide.Rmd
+++ b/vignettes/userguide.Rmd
@@ -35,7 +35,7 @@ Install the `clustermq` package in R from CRAN. This will automatically detect
 if [ZeroMQ](https://github.com/zeromq/libzmq) is installed and otherwise use the
 bundled library:
 
-```r
+```{r eval=FALSE}
 # Recommended:
 # If your system has `libzmq` installed but you want to enable the worker crash
 # monitor, set the following environment variable to enable compilation of the
@@ -47,7 +47,8 @@ install.packages('clustermq')
 Alternatively you can use the `remotes` package to install directly from Github.
 Note that this version needs `autoconf`/`automake` for compilation:
 
-```r
+```{r eval=FALSE}
+# Sys.setenv(CLUSTERMQ_USE_SYSTEM_LIBZMQ=0)
 # install.packages('remotes')
 remotes::install_github('mschubert/clustermq')
 ```
@@ -59,6 +60,7 @@ However, [feedback is very
 welcome](https://github.com/mschubert/clustermq/issues/new).

 ```{r eval=FALSE}
+# Sys.setenv(CLUSTERMQ_USE_SYSTEM_LIBZMQ=0)
 # install.packages('remotes')
 remotes::install_github('mschubert/clustermq', ref="develop")
 ```
@@ -69,6 +71,7 @@ Choose your preferred parallelism using:
 
 ```{r eval=FALSE}
 options(clustermq.scheduler = "your scheduler here")
+# this may require additional setup, for details see below
 ```
 
 There are three kinds of schedulers:
@@ -106,7 +109,7 @@ To set up a scheduler explicitly, see the following links:
 * [SGE](#SGE) - *should work without setup*
 * [SLURM](#SLURM) - *should work without setup*
 * [PBS](#PBS)/[Torque](#TORQUE) - *needs* `options(clustermq.scheduler="PBS"/"Torque")`
-* if you want another scheduler, [open an
+* you can suggest another scheduler by [opening an
   issue](https://github.com/mschubert/clustermq/issues/new)
 
 Default submission templates [are
@@ -212,7 +215,7 @@ Q(fx, x=1:3, export=list(y=10), n_jobs=1)
 ```
 
 If we want to use a package function we need to load it on the worker using the
-`pkg` argument or referencing it with `package_name::`:
+`pkg` argument or referencing it with `package_name::`.
 
 ```{r}
 fx = function(x) {
@@ -259,29 +262,22 @@ register(DoparParam()) # after register_dopar_cmq(...)
 bplapply(1:3, sqrt)
 ```
 
-### With `drake`
+### With `targets`
 
-The [`drake`](https://github.com/ropensci/drake) package enables users to
+The [`targets`](https://github.com/ropensci/targets) package enables users to
 define a dependency structure of different function calls, and only evaluate
 them if the underlying data changed.
 
-> drake — or, Data Frames in R for Make — is a general-purpose workflow manager
-> for data-driven tasks. It rebuilds intermediate data objects when their
-> dependencies change, and it skips work when the results are already up to
-> date. Not every runthrough starts from scratch, and completed workflows have
-> tangible evidence of reproducibility. drake also supports scalability,
-> parallel computing, and a smooth user experience when it comes to setting up,
-> deploying, and maintaining data science projects.
+> The `targets` package is a [Make](https://www.gnu.org/software/make/)-like
+> pipeline tool for statistics and data science in R. The package skips costly
+> runtime for tasks that are already up to date, orchestrates the necessary
+> computation with implicit parallel computing, and abstracts files as R
+> objects. If all the current output matches the current upstream code and
+> data, then the whole pipeline is up to date, and the results are more
+> trustworthy than otherwise.
 
-It can use `clustermq` to perform calculations as jobs:
-
-```{r eval=FALSE}
-library(drake)
-load_mtcars_example()
-# clean(destroy = TRUE)
-# options(clustermq.scheduler = "multicore")
-make(my_plan, parallelism = "clustermq", jobs = 2, verbose = 4)
-```
+It can use `clustermq` to [perform calculations as
+jobs](https://books.ropensci.org/targets/hpc.html#clustermq).
 
 ## Options