is there a limit to the tree length? Some commands not executing despite similar ones are #38

stemangiola · 2020-10-04T07:39:02Z

I have a makeflow file with ~17K commands. Some of them, at the root of the tree, for example:

dev/test_simulation/input__slope_0.5__foreignProp_0.8__S_30__whichChanging_1__run_2.rds:
        Rscript ~/PhD/deconvolution/ARMET/dev/test_simulation_makeflow_pipeline/create_input.R 0.5 0.8 30 1 2 dev/test_simulation/input__slope_0.5__foreignProp_0.8__S_30__whichChanging_1__run_2.rds

are not executed for some reason, while other combinations of parameters are. I don't understand why.


stemangiola · 2020-10-04T07:43:20Z

makefile_test_simulation.makeflow.makeflowlog.zip

makefile_test_simulation.zip

stemangiola · 2020-10-05T03:56:29Z

As you can see, I have a few holes in my benchmark.

The workflow hangs and does not submit any more jobs, and if I interrupt it and start again, it hangs at "starting workflow".

btovar · 2020-10-05T10:35:12Z

Stefano, I'm going through your logs now... Ben


btovar · 2020-10-05T11:24:43Z

Stefano, which command line are you using to run the workflows?

When you say you are changing parameters, are you also changing cores, memory, etc., or only parameters of your tasks?

stemangiola · 2020-10-05T11:56:17Z

Each block of tests is run with different resources, depending on which algorithm is tested.

Here is the command:

makeflow -T slurm -j 100  --do-not-save-failed-output test_simulation_makeflow_pipeline/makefile_test_simulation.makeflow

btovar · 2020-10-05T12:21:46Z

Could you send me the log.out file from:
makeflow -T slurm -j 100 --do-not-save-failed-output test_simulation_makeflow_pipeline/makefile_test_simulation.makeflow > log.out 2>&1

stemangiola · 2020-10-05T12:27:39Z

parsing dev/test_simulation_makeflow_pipeline/makefile_test_simulation.makeflow...
local resources: 32 cores, 193277 MB memory, 148722940 MB disk
max running remote jobs: 100
max running local jobs: 100
checking dev/test_simulation_makeflow_pipeline/makefile_test_simulation.makeflow for consistency...
dev/test_simulation_makeflow_pipeline/makefile_test_simulation.makeflow has 38880 rules.
recovering from log file dev/test_simulation_makeflow_pipeline/makefile_test_simulation.makeflow.makeflowlog...
checking for old running or failed jobs...
checking files for unexpected changes... (use --skip-file-check to skip this step)
starting workflow....

and hangs forever

btovar · 2020-10-05T12:29:14Z

I forgot to add the -dall debug flag, sorry about that:

makeflow -dall -T slurm -j 100 --do-not-save-failed-output test_simulation_makeflow_pipeline/makefile_test_simulation.makeflow > log.out 2>&1

stemangiola · 2020-10-05T12:35:52Z

log.zip

btovar · 2020-10-05T13:02:00Z

Stefano, could you also send me dev/test_simulation_makeflow_pipeline/makefile_test_simulation.makeflow.batchlog?

stemangiola · 2020-10-07T01:50:41Z

I don't have a batchlog. I have rerun the whole workflow. I think one of the issues (not consistent) is that I increased the combinations in the makefile after the workflow was completed, and some of the new benchmarks do not execute.

It is common to execute the whole workflow and then try some more parameter combinations.

btovar · 2020-10-07T11:25:59Z

Stefano, something that just occurred to me. Are you re-running the makeflow in place without a cleaning operation in between? It could be that makeflow is getting confused by a mismatch between the previous execution log and a newly modified makeflow.

stemangiola · 2020-10-07T11:28:48Z

Probably that is the case. But does cleaning lead to the deletion of the dependencies that are already completed? Of course, if I delete the log, everything gets deleted when makeflow is called again.

btovar · 2020-10-07T11:37:55Z

Yes, they will be deleted. A safer mode of operation in this case is to not modify the original file, but instead write the updates to differently named makeflow files. Then you can execute each update in sequence.
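A minimal sketch of that "one file per update, run in sequence" approach. The file names in the usage comment are hypothetical, and the `MAKEFLOW` variable is only an assumption added here so the loop can be dry-run; the `-T slurm -j 100` options are the ones from the command posted earlier in this thread.

```shell
# Run a list of Makeflow files one after another, stopping at the first failure.
# MAKEFLOW can be overridden (e.g. MAKEFLOW=echo) to dry-run the loop.
MAKEFLOW="${MAKEFLOW:-makeflow}"

run_in_sequence() {
    for flow in "$@"; do
        # Same batch options as the command used earlier in this thread.
        $MAKEFLOW -T slurm -j 100 "$flow" || {
            echo "stopped: $flow failed" >&2
            return 1
        }
    done
}

# Hypothetical usage: the original workflow first, then each update file.
# run_in_sequence base.makeflow update_01.makeflow update_02.makeflow
```

Each file keeps its own log this way, so an update never has to be reconciled with the log of an earlier, differently shaped workflow.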

stemangiola · 2020-10-07T23:40:02Z

I understand, but this is not always possible in a combinatorial scenario.

expand_grid(
	slope = c(-2, -1, -.5, .5, 1, 2), 
	foreign_prop = c(0, 0.5, 0.8),
	S = c(30, 60, 90),
	which_changing = 1:16,
	run = 1:5,
	method = c("ARMET", "cibersort", "llsr", "epic")
)

I can extend the parameter space here arbitrarily with no effort. It would be great if makeflow could update the log file with the new dependencies, and just add them to the tree.

Otherwise makeflow would be suitable only for static workflows.
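For illustration, here is a rough sketch of how a grid like this turns into rules of the form shown at the top of this issue. It covers only three of the parameters, and the `out/` directory and the bare `create_input.R` script name are placeholders rather than the real pipeline paths.

```shell
# Print one Makeflow rule per parameter combination:
# a "target:" line followed by a tab-indented command line.
# Paths and the script name are placeholders for illustration only.
emit_rules() {
    for slope in -2 -1 -.5 .5 1 2; do
        for foreign_prop in 0 0.5 0.8; do
            for S in 30 60 90; do
                out="out/input__slope_${slope}__foreignProp_${foreign_prop}__S_${S}.rds"
                printf '%s:\n\tRscript create_input.R %s %s %s %s\n' \
                    "$out" "$slope" "$foreign_prop" "$S" "$out"
            done
        done
    done
}

# Hypothetical usage: emit_rules > candidate_rules.txt
```

Adding a value to any of the loops changes where the new rules land in the generated file, which is why new rules end up in the middle rather than only at the end.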

btovar · 2020-10-08T11:27:51Z

I think that just appending new rules may be workable, with the understanding that removing a rule, or changing a previously executed rule will result in failure. Would that be something helpful to your use case?

stemangiola · 2020-10-08T12:08:42Z

Yes. Usually when benchmarking we want to increase the number of combinations. We don't need to delete rules, as we can ignore dependencies that have already been executed, and we would eliminate rules on another run if needed.

The issue is that if I now add rules to an existing makefile (with a log), the only ones that execute are the new ones at the bottom. The new ones in the middle are ignored. This mixed behaviour seems more unintended than designed.
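Given that behaviour, one possible workaround is to never insert rules in the middle and only append the missing ones at the bottom. The sketch below assumes a candidate file with exactly two lines per rule (a `target:` line, then the tab-indented command) and no blank lines. The function name and both file arguments are hypothetical, and this has not been checked against how makeflow reconciles its log.

```shell
# Append only those rules whose target line is not already in the Makeflow file.
# Existing lines are never edited or reordered; new rules go at the bottom.
append_new_rules() {
    flow_file=$1     # Makeflow file to extend
    candidates=$2    # two lines per rule: "target:" then tab-indented command
    touch "$flow_file"
    # IFS= and -r keep the leading tab and any backslashes intact.
    while IFS= read -r target_line && IFS= read -r command_line; do
        # -x: whole-line match, -F: fixed string (no regex interpretation).
        if ! grep -qxF -- "$target_line" "$flow_file"; then
            printf '%s\n%s\n\n' "$target_line" "$command_line" >> "$flow_file"
        fi
    done < "$candidates"
}

# Hypothetical usage:
# append_new_rules makefile_test_simulation.makeflow candidate_rules.txt
```

Running it twice with the same candidates adds nothing the second time, so the generator can always emit the full grid.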

btovar · 2020-10-08T12:12:41Z

Stefano, thanks for your input! Let me discuss it with the team.
