SPANK does not recognize io-watchdog properly #1

Open
GoogleCodeExporter opened this issue May 11, 2015 · 0 comments

Hi,
I would like to link io-watchdog to my SLURM installation so that hung jobs can be stopped.

Currently, I'm running some tests with an LLNL resilience library called SCR. I've set up four nodes and installed all the software needed to run MPI jobs under SLURM. One of these nodes acts as the SLURM controller and the other three as compute nodes. This system is working properly at the moment.

OS: Debian 3.2.51-1 (Squeeze)
SLURM Version: 2.6.4
SLURM Spank Plugins version: 0.23
io-watchdog version: 0.8

To link io-watchdog with SLURM, I need to install and configure SPANK so that it dynamically loads the library when the user calls srun, as the io-watchdog documentation says. After a successful installation, I added the io-watchdog dynamic library path to /etc/ld.so.conf.d/ and included the following lines in plugstack.conf:
required /usr/local/lib/io-watchdog/io-watchdog.so
required /usr/local/lib/io-watchdog-interposer.so

(I checked that those paths are right)

However, the following messages appear in the log when starting the SLURM daemon:
# tail -n14 slurmd.log 
[2013-11-22T09:21:55.719] debug:  spank: opening plugin stack /usr/local/etc/plugstack.conf
[2013-11-22T09:21:55.723] debug3: Couldn't find sym 'slurm_spank_init' in the plugin
[2013-11-22T09:21:55.728] debug3: Couldn't find sym 'slurm_spank_slurmd_init' in the plugin
[2013-11-22T09:21:55.732] debug3: Couldn't find sym 'slurm_spank_job_prolog' in the plugin
[2013-11-22T09:21:55.736] debug3: Couldn't find sym 'slurm_spank_init_post_opt' in the plugin
[2013-11-22T09:21:55.740] debug3: Couldn't find sym 'slurm_spank_local_user_init' in the plugin
[2013-11-22T09:21:55.744] debug3: Couldn't find sym 'slurm_spank_user_init' in the plugin
[2013-11-22T09:21:55.750] debug3: Couldn't find sym 'slurm_spank_task_init_privileged' in the plugin
[2013-11-22T09:21:55.754] debug3: Couldn't find sym 'slurm_spank_task_post_fork' in the plugin
[2013-11-22T09:21:55.760] debug3: Couldn't find sym 'slurm_spank_task_exit' in the plugin
[2013-11-22T09:21:55.764] debug3: Couldn't find sym 'slurm_spank_job_epilog' in the plugin
[2013-11-22T09:21:55.769] debug3: Couldn't find sym 'slurm_spank_slurmd_exit' in the plugin
[2013-11-22T09:21:55.773] debug3: Couldn't find sym 'slurm_spank_exit' in the plugin
[2013-11-22T09:21:55.778] debug2: spank: /usr/local/lib/io-watchdog/io-watchdog.so: no callbacks in this context

Note: Full log file is attached to this message.

Could a version incompatibility be the reason why SPANK doesn't use the io-watchdog library?
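
One way to cross-check the log independently of slurmd is to load the shared object by hand and probe it for the callback names SPANK looks up. The following is only a minimal sketch, assuming Python 3 is available on the compute node; the helper names (`exported_callbacks`, `check_plugin`) are made up for this illustration and are not part of SLURM or io-watchdog. Loading may fail outside slurmd if the plugin depends on symbols that only the SLURM binaries provide, in which case the script just reports the loader error.

```python
import ctypes
import sys

# Callback names copied from the debug3 messages in slurmd.log, plus
# slurm_spank_task_init, which is not among the symbols reported missing.
SPANK_CALLBACKS = [
    "slurm_spank_init",
    "slurm_spank_slurmd_init",
    "slurm_spank_job_prolog",
    "slurm_spank_init_post_opt",
    "slurm_spank_local_user_init",
    "slurm_spank_user_init",
    "slurm_spank_task_init_privileged",
    "slurm_spank_task_init",
    "slurm_spank_task_post_fork",
    "slurm_spank_task_exit",
    "slurm_spank_job_epilog",
    "slurm_spank_slurmd_exit",
    "slurm_spank_exit",
]


def exported_callbacks(handle):
    """Return the SPANK callback names that `handle` can resolve."""
    # A ctypes library object raises AttributeError for an unknown symbol,
    # so hasattr() acts as a symbol lookup here.
    return [name for name in SPANK_CALLBACKS if hasattr(handle, name)]


def check_plugin(path):
    """Load `path` and print which SPANK callbacks it exports."""
    try:
        handle = ctypes.CDLL(path)
    except OSError as err:
        # Missing file, wrong architecture, or unresolved dependencies.
        print(f"{path}: cannot load ({err})")
        return None
    found = exported_callbacks(handle)
    print(f"{path}: {', '.join(found) if found else 'no SPANK callbacks exported'}")
    return found


if __name__ == "__main__":
    # Pass the plugin paths on the command line, for example the
    # io-watchdog.so path listed in plugstack.conf.
    for plugin_path in sys.argv[1:]:
        check_plugin(plugin_path)
```

If a library listed in plugstack.conf exports none of these names, slurmd has nothing to call in it, whatever the SLURM version.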

Thanks in advance,
 Jorge

Original issue reported on code.google.com by jbellonc...@gmail.com on 22 Nov 2013 at 9:33

Attachments:
