Hi team,
I am using this as part of https://github.com/epi2me-labs/wf-transcriptomes/
The make batches step ran fine and generated batches 0-48, but the clustering step that follows failed.
The error message is:
slurmstepd: error: Detected 3526 oom_kill events in StepId=12206690.batch. Some of the step tasks have been OOM Killed.
However, when I examined the log files, all of the job_level_0 output had been generated, but most of the level_1 output had not.
I tried running one of the failed commands from level_1.sh by hand:
isONclust2 cluster -x sahlin -v -Q -l clusters/isONcluster_0.cer -r clusters/isONcluster_1.cer -o clusters/isONcluster_49.cer ; sync
It ended with a segmentation fault (core dumped):
Loaded input batch from clusters/isONcluster_0.cer:
Batch number: 0
Batch range: [0,16883]
Depth: 0
Nr sequences: 16884
Nr bases: 50287830
Nr clusters: 1
Nr nontrivial clusters: 1
Minimizers in database: 22619
Loaded input batch from clusters/isONcluster_1.cer:
Batch number: 1
Batch range: [16884,39157]
Depth: 0
Nr sequences: 22274
Nr bases: 50286904
Nr clusters: 2
Nr nontrivial clusters: 2
Minimizers in database: 0
Generating consensus using spoa algorithm: semi-global
Clustering mode: sahlin
Segmentation fault (core dumped)
Some of the level_1 commands did run successfully, for example:
isONclust2 cluster -x sahlin -v -Q -l clusters/isONcluster_8.cer -r clusters/isONcluster_9.cer -o clusters/isONcluster_53.cer ; sync
Loaded input batch from clusters/isONcluster_8.cer:
Batch number: 8
Batch range: [219434,255621]
Depth: 0
Nr sequences: 36188
Nr bases: 50286190
Nr clusters: 38
Nr nontrivial clusters: 38
Minimizers in database: 23082
Loaded input batch from clusters/isONcluster_9.cer:
Batch number: 9
Batch range: [255622,293538]
Depth: 0
Nr sequences: 37917
Nr bases: 50287047
Nr clusters: 33
Nr nontrivial clusters: 32
Minimizers in database: 0
Generating consensus using spoa algorithm: semi-global
Clustering mode: sahlin
Filtered out 0 input clusters smaller than 2.
Finished clustering!
Alignment invocation count: 0 (0%)
Consensus invocation count: 33 (100%)
Number of clusters larger than 1: 38
Output batch statistics:
Batch number: 8
Batch range: [219434,293538]
Depth: 1
Nr sequences: 74105
Nr bases: 100573237
Nr clusters: 38
Nr nontrivial clusters: 38
Minimizers in database: 24370
Output batch written to: clusters/isONcluster_53.cer
I noticed that "Minimizers in database" is 0 for the right-hand batch (the -r input), but it is 0 in both the failing and the successful run, so I am not sure whether this is related. This failure then caused all of the subsequent steps to fail. The batch files seem small, and I requested 16GB per core in Slurm.
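To make it easier to compare the failing and successful pairs, here is a minimal sketch that pulls the per-batch statistics out of the verbose output pasted above. It assumes the log lines look exactly as they do in this post; `parse_batches` and `SAMPLE` are names made up for illustration and are not part of isONclust2.

```python
import re

# Excerpt of the verbose output from the failing command in this post.
SAMPLE = """\
Loaded input batch from clusters/isONcluster_0.cer:
Batch number: 0
Nr sequences: 16884
Nr clusters: 1
Nr nontrivial clusters: 1
Minimizers in database: 22619
Loaded input batch from clusters/isONcluster_1.cer:
Batch number: 1
Nr sequences: 22274
Nr clusters: 2
Nr nontrivial clusters: 2
Minimizers in database: 0
Generating consensus using spoa algorithm: semi-global
Clustering mode: sahlin
Segmentation fault (core dumped)
"""

HEADER = re.compile(r"Loaded input batch from (\S+):$")
FIELD = re.compile(r"(Nr sequences|Nr clusters|Minimizers in database): (\d+)$")


def parse_batches(log_text):
    """Return one dict per 'Loaded input batch from ...' block in the log."""
    batches = []
    current = None
    for raw in log_text.splitlines():
        line = raw.strip()
        header = HEADER.match(line)
        if header:
            # A new input batch block starts here.
            current = {"file": header.group(1)}
            batches.append(current)
            continue
        if current is None:
            continue
        field = FIELD.match(line)
        if field:
            current[field.group(1)] = int(field.group(2))
        elif line.startswith(("Generating consensus", "Output batch statistics")):
            # Input blocks are over; ignore the output statistics that follow.
            current = None
    return batches


if __name__ == "__main__":
    for batch in parse_batches(SAMPLE):
        print(batch)
```

Running this over the logs of every level_1 command should show whether the crashing pairs differ from the successful ones in sequence count, cluster count, or minimizer count.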
I would appreciate some help getting this to run, if you could kindly have a look at the issue.
Thanks a lot.