Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

biomake -Q slurm prints an error when the subdirectory does not exist #53

Open
sjackman opened this issue Jan 18, 2018 · 14 comments
Open
Labels

Comments

@sjackman
Copy link
Contributor

sjackman commented Jan 18, 2018

Even though it prints this noisy error message, it seems to complete successfully. It looks like may be failing to create the file foo/.biomake/slurm/job/bar.

foo/bar:
	mkdir -p $(@D)
	touch $@
$ biomake -Q slurm foo/bar
Target foo/bar not materialized - build required
Exception: error(existence_error(directory,/home/sjackman/tmp/foo/.biomake),context(system:make_directory/1,No such file or directory))
  [25] backtrace(99) at /home/sjackman/.linuxbrew/Cellar/swi-prolog/7.6.4/libexec/lib/swipl-7.6.4/library/prolog_stack.pl:444
  [24] prolog_exception_hook('<garbage_collected>',_528,539,501) at /home/sjackman/.linuxbrew/Cellar/biomake/HEAD-5c63168/prolog/biomake/biomake.pl:107
  [23] <meta call>
  [22] make_directory("/home/sjackman/tmp/foo/.biomake") <foreign>
  [21] catch(utils:make_directory("/home/sjackman/tmp/foo/.biomake"),error(existence_error(directory,"/home/sjackman/tmp/foo/.biomake"),context(...,'No such file or directory')),utils:fail) at /home/sjackman/.linuxbrew/Cellar/swi-prolog/7.6.4/libexec/lib/swipl-7.6.4/boot/init.pl:371
  [20] utils:safe_make_directory("/home/sjackman/tmp/foo/.biomake") at /home/sjackman/.linuxbrew/Cellar/biomake/HEAD-5c63168/prolog/biomake/utils.pl:314
  [19] utils:biomake_make_subdir_list('/home/sjackman/tmp/foo','<garbage_collected>') at /home/sjackman/.linuxbrew/Cellar/biomake/HEAD-5c63168/prolog/biomake/utils.pl:306
  [18] utils:biomake_private_filename_dir_exists('foo/bar',[slurm,"job"],_894) at /home/sjackman/.linuxbrew/Cellar/biomake/HEAD-5c63168/prolog/biomake/utils.pl:300
  [17] queue:run_execs_with_qsub(slurm,rb('foo/bar',[],true,["mkdir -p foo"|...],v(_1012,'foo/bar',[],...)),[],[queue(slurm),...|...]) at /home/sjackman/.linuxbrew/Cellar/biomake/HEAD-5c63168/prolog/biomake/queue.pl:59
  [15] biomake:dispatch_run_execs('<garbage_collected>',[],[queue(slurm),...|...]) at /home/sjackman/.linuxbrew/Cellar/biomake/HEAD-5c63168/prolog/biomake/biomake.pl:468
  [14] biomake:run_execs_and_update('<garbage_collected>',[],'<garbage_collected>') at /home/sjackman/.linuxbrew/Cellar/biomake/HEAD-5c63168/prolog/biomake/biomake.pl:451
  [13] biomake:build('foo/bar',[],[queue(slurm),...|...]) at /home/sjackman/.linuxbrew/Cellar/biomake/HEAD-5c63168/prolog/biomake/biomake.pl:164
  [11] '$apply':forall('<garbage_collected>',user:build(_1344,...)) at /home/sjackman/.linuxbrew/Cellar/swi-prolog/7.6.4/libexec/lib/swipl-7.6.4/boot/apply.pl:51
  [10] build_toplevel([queue(slurm),...|...]) at /home/sjackman/.linuxbrew/Cellar/biomake/HEAD-5c63168/prolog/biomake/cli.pl:41
   [9] main at /home/sjackman/.linuxbrew/Cellar/biomake/HEAD-5c63168/prolog/biomake/cli.pl:17
   [8] '<meta-call>'('<garbage_collected>') <foreign>
   [7] catch(user:(...,...),_1568,'$toplevel':true) at /home/sjackman/.linuxbrew/Cellar/swi-prolog/7.6.4/libexec/lib/swipl-7.6.4/boot/init.pl:371

Note: some frames are missing due to last-call optimization.
Re-run your program in debug mode (:- debug.) to get more detail.
Submitting job: sbatch -o /home/sjackman/tmp/foo/.biomake/slurm/out/bar -e /home/sjackman/tmp/foo/.biomake/slurm/err/bar     --parsable /home/sjackman/tmp/foo/.biomake/slurm/script/bar >/home/sjackman/tmp/foo/.biomake/slurm/job/bar
foo/bar queued for rebuild
@sjackman
Copy link
Contributor Author

sjackman commented Jan 18, 2018

If biomake creates that foo/.biomake directory, the Makefile needs to be structured so that it doesn't fail if the directory already exists. For example, removing the -p from mkdir would fail.

foo/bar:
	mkdir $(@D)
	touch $@
❯❯❯ biomake -Q slurm foo/bar
❯❯❯ cat foo/.biomake/slurm/err/bar
mkdir: cannot create directory ‘foo’: File exists

@ihh ihh added the Bug label Jan 18, 2018
@ihh
Copy link
Member

ihh commented Jan 18, 2018

yes, biomake currently leverages the filesystem far too intimately; it assumes that the subdirectory is there, to stash its internal files in the hidden directory .biomake

This is certainly ugly but, as you point out, you can work around it easily enough by structuring the Makefile to ensure the directory exists, e.g. by adding an extra rule converting your Makefile to something like

all: foo/bar

%/bar: %
	echo hello there >$@

foo:
	mkdir -p $@

If that doesn't work for you, I get it, but it seems to me that the only other option is to have biomake anticipate that mkdir -p somehow, which seems wrong to me.

Of course, biomake could also find some way of storing its internal queue-related info that doesn't rely on the directory hierarchy being in place. I accept that this is a bug due to a design flaw (over-reliance on filesystem)...

Open to additional thoughts on this.... incidentally, if bug reports can be reproduced using the internal thread pool queue implementation (-Q poolq) as opposed to things like slurm or sge that require those engines to be present (or simulated) for debugging, that'd be awesome ;)

@ihh
Copy link
Member

ihh commented Jan 18, 2018

hmmm, now i am wondering if that example I gave would work... actually it might still run into the same issue. going to think about this some more...

@ihh
Copy link
Member

ihh commented Jan 18, 2018

Perhaps the best thing is just to do the mkdir -p implicitly after all.... hmmmmm..... any thoughts @cmungall @sjackman ?

@sjackman
Copy link
Contributor Author

sjackman commented Jan 18, 2018

I'm okay with doing the mkdir -p implicitly. Note that will break the example that I give in #53 (comment) due to mkdir: cannot create directory ‘foo’: File exists.

I do in fact like the design of biomake and how it leverages the file system. It does lead to this issue though.

I avoid whenever possible having a rule depend on a directory, because the directories time stamp is pretty unstable. Any time a file within that directory changes, the time stamp of the directory changes. A simple workaround is to the time stamp issue follows. I think it suffers from the same issue though, of foo/ not existing when Biomake creates its metafile.

all: foo/bar

%/bar: %/stamp
	echo hello there >$@

foo/stamp:
	mkdir -p $(@D)
	touch $@

@sjackman
Copy link
Contributor Author

sjackman commented Jan 18, 2018

I'm troubleshooting a program that appears to misbehave when it's output directory already exists, before it starts running.

Here's a suggestion. The current Biomake metadata directory is

./$(@D)/.biomake/slurm/xxx/$(@F)

How about instead…

./.biomake/slurm/xxx/$(@D)/$(@F)

where xxx is err job out script. It would resolve this issue, and it has the nice side effect of the directory structure of ./.biomake/slurm/out exactly mirroring the real directory structure of output files.

@ihh
Copy link
Member

ihh commented Jan 18, 2018

@sjackman The problem with that is that the biomake directory for target X would then depend on which directory biomake was invoked in, as well as the location of X.

In general this seems a bit fragile to me, and it would clearly break some cases (e.g. caching the MD5 checksum of a file)

@sjackman
Copy link
Contributor Author

Is your concern specifically with recursive Makefile? That is, a Makefile in each subdirectory? I usually use one sole Makefile in the top-level. I don't think your concern affects that situation.

@ihh
Copy link
Member

ihh commented Jan 18, 2018

There are a few situations I can imagine where having the target metadata associated with the Makefile location (or the current working directory), rather than the target location, would be a bad idea.

One such situation would be including Makefiles from other Makefiles. Another would be the case (common-ish for me) where a Makefile is refactored/split into one or more other Makefiles. Or indeed any situation where the Makefile is moved.

Another gotcha situation would be if the Makefile uses absolute paths and the user expects to be able to invoke it from other, arbitrary directories.

Make is already fragile enough in such situations - c.f. hacks involving the $(MAKE), $(MAKEFILE_LIST) and $(MAKE_ARGS) variables. I'm not convinced that it's an improvement to go from a requirement that subdirectories be present (which at least is a transparent problem, barfy error message notwithstanding) to a requirement that biomake always be invoked from the same directory.

@sjackman
Copy link
Contributor Author

I'm suggesting that the .biomake directory be in the current working directory, not relative to the location of the Makefile, which I agree would be a bad idea, so the location of the Makefile shouldn't matter. Absolute paths would be trouble.

I'm okay with closing this issue as a known limitation if that's your preference. It would be nice to avoid the ugly error message though if the directory doesn't already exist. Biomake should probably create the directory in that case I'd say.

@ihh
Copy link
Member

ihh commented Jan 18, 2018

I'm still kind of hoping that an elegant solution will present itself... e.g. if @cmungall jumps in...

I can certainly mitigate it by

  • improving the error message
  • adding a command-line option (or special target) that directs biomake to create intermediate directories as needed

It may be a few days before i have another chance to do some biomake hacking, so let's leave it open for discussion until then at least

@ihh
Copy link
Member

ihh commented Jan 18, 2018

Also, if we allow for switchable behavior, there are other options... for example the location of metadata directories (in CWD vs in target dir) could be a user-changeable option... seems messy but perhaps the only way to cover all cases...

@sjackman
Copy link
Contributor Author

adding a command-line option (or special target) that directs biomake to create intermediate directories as needed

Oddly, I believe it's already doing this, making the directory. Even though it prints the error message above, it still continues happily on and creates all the necessary directories and files.

@sjackman
Copy link
Contributor Author

Also, if we allow for switchable behavior, there are other options... for example the location of metadata directories (in CWD vs in target dir) could be a user-changeable option... seems messy but perhaps the only way to cover all cases...

For my case without recursion, I'd find that helpful, .biomake in CWD. I have a workaround in the mean time. I'll think on it too.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

2 participants