
Provided cores differ to what's passed in the CLI #97

Open
vinisalazar opened this issue May 26, 2022 · 1 comment

vinisalazar commented May 26, 2022

Hi,

First, thank you for providing this profile. It's extremely useful.

I'm having a problem launching multiple jobs at the same time. For example, I want to launch 5 jobs in parallel, each with 64 cores.

If I run snakemake --cores 64, I find that the jobs get launched sequentially rather than in parallel. I understand that this is because I requested a "maximum" of 64 cores, so if a single job takes all of them, only one job can run at a time.

Now, I wrote a function that is passed to the rules' threads directive and multiplies workflow.cores by, say, 0.2, so that I can pass snakemake --cores 320 and each rule will be allocated 64 cores. However, I am finding that this scaling somehow gets applied twice ("squared"). What happens is:

  • the Snakemake STDOUT (what is shown on the screen) shows the correct number of threads:
rule map_reads:
    input: output/mapping/H/catalogue.mmi, output/qc/merged/H_S003_R1.fq.gz, output/qc/merged/H_S003_R2.fq.gz
    output: output/mapping/bam/H/H_S003.map.bam
    log: output/logs/mapping/map_reads/H-H_S003.log
    jobid: 247
    benchmark: output/benchmarks/mapping/map_reads/H-H_S003.txt
    reason: Missing output files: output/mapping/bam/H/H_S003.map.bam
    wildcards: binning_group=H, sample=H_S003
    threads: 64
    resources: tmpdir=/tmp, mem_mb=149952


    minimap2 -t 64 > output/mapping/bam/H/H_S003.map.bam

Submitted job 247 with external jobid '35894421'.

That looks fine. I want this rule to be launched with 64 cores, and with this setup, 5 instances of the rule get launched at the same time.

When I open the job's SLURM log, however, I find that this value of 64 is passed to the job as the "Provided cores", and is therefore multiplied by 0.2 again.

Contents of slurm-35894421.out:

Building DAG of jobs...
Using shell: /usr/bin/bash
Provided cores: 64
Rules claiming more threads will be scaled down.
Select jobs to execute...

rule map_reads:
    input: output/mapping/H/catalogue.mmi, output/qc/merged/H_S003_R1.fq.gz, output/qc/merged/H_S003_R2.fq.gz
    output: output/mapping/bam/H/H_S003.map.bam
    log: output/logs/mapping/map_reads/H-H_S003.log
    jobid: 247
    benchmark: output/benchmarks/mapping/map_reads/H-H_S003.txt
    reason: Missing output files: output/mapping/bam/H/H_S003.map.bam
    wildcards: binning_group=H, sample=H_S003
    threads: 13
    resources: tmpdir=/tmp, mem_mb=149952


    minimap2 -t 13 > output/mapping/bam/H/H_S003.map.bam

Even worse, my job allocates 64 cores but only uses 13 (64 * 0.2, rounded). It seems really strange that the Snakemake output shows the "correct" value while the SLURM log shows the "real" value that was actually used. Why do they differ?

I am trying to understand what I am doing wrong. When I set a breakpoint in the function used to get the number of threads, the workflow.cores variable is always what I pass on the command line (320), never what is shown in the SLURM log.
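
For reference, the threads function is roughly of the following form (a minimal sketch: the helper name, the trimmed-down rule, and the shell command are illustrative rather than my real code; only the workflow.cores * 0.2 logic matches what I described above):

# Snakefile (sketch)
def get_threads(fraction=0.2):
    # workflow.cores is whatever was passed via --cores on the command line
    return max(1, round(workflow.cores * fraction))

rule map_reads:
    input:
        "output/mapping/{binning_group}/catalogue.mmi",
        "output/qc/merged/{sample}_R1.fq.gz",
        "output/qc/merged/{sample}_R2.fq.gz"
    output:
        "output/mapping/bam/{binning_group}/{sample}.map.bam"
    threads: get_threads(0.2)  # evaluates to 64 when --cores 320 is given
    shell:
        "minimap2 -t {threads} {input} > {output}"  # actual command is longer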

I tried adding a nodes: 5 or a jobs: 5 key to the profile config.yaml, but it doesn't do any good. Is there anything I can modify in the profile to make sure I can launch as many parallel jobs as possible?
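
For completeness, this is roughly what that attempt looked like (a sketch of the keys mentioned above; neither had any effect on parallelism):

# profile config.yaml (sketch)
jobs: 5
# also tried: nodes: 5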

Please let me know what other information I can provide. Thank you very much.

Best,
V

vinisalazar (Author) commented:

Update:

I found a temporary fix by using Snakemake's max-threads option. However, I am still trying to understand what the best practice for my scenario would be, and why my function call is "doubled": that is, why the SLURM log differs from the Snakemake output, showing the rule's thread count as the job's "Provided cores" (from which the number of threads then gets calculated again). Thank you!
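
For reference, the workaround is simply capping the per-rule thread count with that option on top of the usual invocation, roughly like this (a sketch; 64 is just the per-rule cap I want, and the profile flags are omitted):

snakemake --cores 320 --max-threads 64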
