Allow option for workflow to end at inspiral jobs #4612
aeba20e to b18a2c3
@@ -187,8 +188,76 @@ insps = wf.merge_single_detector_hdf_files(workflow, hdfbank,
                                           insps, output_dir,
                                           tags=['full_data'])

# Check for the only-do section. If inspiral=true then it will only do the workflow up until the merged inspiral hdf files
if 'only-do' in workflow.cp.sections():
The option may need a better name so it is clear in the config file. Maybe something more verbose?
else:
    only_do_inspiral = False

if only_do_inspiral:
It looks like we could combine this with the code at the end to remove the redundancy here. Maybe put it into a function that we can simply call from either location?
As further clarification, this is useful in the case where you will do all the post-processing at a different cluster. One can finish the computationally expensive parts on the OSG and then transfer the data to a local cluster to do the later workflow stages.
Thanks @kkacanja. A couple of comments/requests from me on this:
Maybe the easiest way to achieve all of this is to put large blocks of the code into a number of functions in the executable and not call some of them (or return early from the function) if this option is given. The code is already split up into logical blocks: do main filtering, do plots, do injection filtering, do injection plots, etc. Each of these could be made a function.
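To make the suggestion concrete, here is a minimal sketch of the "one function per logical block, stop early if asked" structure. The stage functions, their names, and `build_workflow` are all hypothetical stand-ins; the real executable would add workflow jobs in each stage rather than append strings to a list.

```python
# Hypothetical stage functions. In the real executable each of these would
# set up jobs in the workflow; here they only record that they ran.
def do_main_filtering(log):
    log.append("main filtering")


def do_plots(log):
    log.append("plots")


def do_injection_filtering(log):
    log.append("injection filtering")


def do_injection_plots(log):
    log.append("injection plots")


# Ordered (stage name, function) pairs. The names are illustrative only.
STAGES = [
    ("main_filtering", do_main_filtering),
    ("plots", do_plots),
    ("injection_filtering", do_injection_filtering),
    ("injection_plots", do_injection_plots),
]


def build_workflow(stop_after=None):
    """Run the stages in order, stopping once `stop_after` has run."""
    log = []
    for name, stage in STAGES:
        stage(log)
        if name == stop_after:
            break  # later stages are simply never called
    return log


partial = build_workflow(stop_after="main_filtering")
full = build_workflow()
```

With this layout, adding another stop point later only means adding another named stage, not another copy of the finalization code.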
b18a2c3 to 4e17321
@@ -183,6 +257,17 @@ ind_insps = insps = wf.setup_matchedfltr_workflow(workflow, analyzable_segs,
                                           datafind_files, splitbank_files_fd,
                                           output_dir, tags=['full_data'])

if workflow.cp.has_option('workflow', 'stop-after-inspiral'):
    stop_after_inspiral = workflow.cp.getboolean('workflow', 'stop-after-inspiral')
Make this
stop-after = inspiral (or statmap, etc.).
@kkacanja Change the one thing noted, and test locally. If that works, I think this can then be merged. Adding additional stop points as people need them can be left to the future. I think adding one each for inspiral and statmap is fine here.
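For reference, a rough sketch of what reading a string-valued `stop-after` option (instead of the boolean `stop-after-inspiral`) could look like. It uses the standard-library `ConfigParser` as a stand-in for `workflow.cp`; `get_stop_after` and `VALID_STOP_POINTS` are hypothetical names, and the allowed values are the ones listed in this PR's description.

```python
from configparser import ConfigParser

# Stop points named in the PR description; treat this tuple as an assumption.
VALID_STOP_POINTS = ("inspiral", "hdf_trigger_merge", "statmap")


def get_stop_after(cp):
    """Return the requested stop point, or None if the option is absent."""
    if not cp.has_option("workflow", "stop-after"):
        return None
    value = cp.get("workflow", "stop-after").strip()
    if value not in VALID_STOP_POINTS:
        raise ValueError(
            "stop-after must be one of %s, got %r" % (VALID_STOP_POINTS, value)
        )
    return value


# Stand-in for the workflow's configuration object.
cp = ConfigParser()
cp.read_string("[workflow]\nstop-after = inspiral\n")
stop_after = get_stop_after(cp)
```

A single string option keeps the config file the same shape as more stop points are added, whereas one boolean per stop point would need a new option each time.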
layout.single_layout(rdir['workflow'], ([dashboard_file, gen_file_html, log_file_html]))
sys.exit(0)

def check_stop(job_name, container, workflow, finalize_workflow):
You might want to add a short comment here to explain what the function does.
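As an illustration of the kind of comment meant here, a sketch of the function with a docstring. Only the signature comes from the diff; the body, the return value, and the meaning of each parameter are guesses. In particular, treating `finalize_workflow` as a callable is an assumption made so that the example runs on its own.

```python
from configparser import ConfigParser
from types import SimpleNamespace


def check_stop(job_name, container, workflow, finalize_workflow):
    """Finalize the workflow early if ``job_name`` is the requested stop point.

    Assumed behaviour, not taken from the diff: read ``stop-after`` from the
    ``[workflow]`` section and, if it matches ``job_name``, call
    ``finalize_workflow(container, workflow)``. Returns True when the
    workflow should stop after this job, False otherwise.
    """
    cp = workflow.cp
    if not cp.has_option("workflow", "stop-after"):
        return False
    if cp.get("workflow", "stop-after") != job_name:
        return False
    finalize_workflow(container, workflow)  # hypothetical callable
    return True


# Stand-ins for the real workflow objects, for illustration only.
cp = ConfigParser()
cp.read_string("[workflow]\nstop-after = statmap\n")
fake_workflow = SimpleNamespace(cp=cp)
finalized = []


def record_finalize(container, workflow):
    finalized.append(container)


stopped_at_inspiral = check_stop("inspiral", "main", fake_workflow, record_finalize)
stopped_at_statmap = check_stop("statmap", "main", fake_workflow, record_finalize)
```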
* Updates to stopping after data inspiral jobs and statmap
* Updates to stopping after data inspiral jobs and statmap
* Changed typing of stop-after
* Allow pycbc_make_offline_search_workflow to stop after inspiral jobs.
* Fixed comment
Added the option stop-after under [workflow]. For the time being it accepts 'inspiral', 'hdf_trigger_merge', or 'statmap', and the workflow is then generated only up to the specified job.
This is useful for clusters such as OSG, where inspiral jobs can be split into single-core jobs and parallelized over many nodes. The later, non-parallel processing stages are not well suited to OSG, so this option lets those stages be run elsewhere.
This does not take into account injection inspiral jobs.
Also, when using these options, make sure to change the file-retention level in [workflow] such that your files do not get deleted when the workflow finishes running.
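As a hedged example of the two settings together, the snippet below embeds a config fragment and reads it back with the standard-library `ConfigParser`. The `file-retention-level` value `all_triggers` is quoted from memory of the PyCBC workflow options and should be checked against the PyCBC documentation before use; `EXAMPLE_INI` and `settings` are illustrative names.

```python
from configparser import ConfigParser

# Example [workflow] section. The retention value is an assumption; pick
# whichever level keeps the files you need to transfer to the other cluster.
EXAMPLE_INI = """
[workflow]
stop-after = inspiral
file-retention-level = all_triggers
"""

cp = ConfigParser()
cp.read_string(EXAMPLE_INI)

# Collect the section into a plain dict for inspection.
settings = dict(cp.items("workflow"))
```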