-
Notifications
You must be signed in to change notification settings - Fork 348
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Unknown injections handling in offline #4360
Unknown injections handling in offline #4360
Conversation
I had been wanting to allow in-frame injections which have a file as well, in case of supplied known injections, but I realised that can be done using #4363 and overriding the |
100145f
to
9f92ddf
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I'm happy with this. Is there an example output page or test?
@@ -467,6 +467,9 @@ fore_xmlall = wf.make_foreground_table(workflow, combined_bg_file, | |||
fore_xmlloudest = wf.make_foreground_table(workflow, combined_bg_file, | |||
hdfbank, rdir['open_box_result'], singles=insps, | |||
extension='.xml', tags=["xmlloudest"]) | |||
fore_hdfall = wf.make_foreground_table(workflow, combined_bg_file, | |||
hdfbank, rdir['open_box_result'], singles=insps, | |||
extension='.hdf', tags=["hdfall"]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This would appear to require changes to config files? Not sure I can see those?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the pointer, missed that!
Added now, and running test workflow to generate some results pages. I'll add some injection generation / searching into the CI tests as well (similar to the live CI)
c25d95f
to
8fc08e3
Compare
Note that changes since the approval will show up on GarethCabournDavies/pycbc@1f4991f...unknown_injections The URL has updated now that I've rebased to master |
3840827
to
43054a6
Compare
Note: Current failure in the CI will be fixed by #4404 |
43054a6
to
95b8cb2
Compare
Poke: Still failing search test. |
Searched through the logs and can't find any reasons why the coinc jobs for unknown injections are failing, just the transfers of the output file |
Okay, the CI failures are due to huge injinj / fullinj / injfull jobs which are being booted I think - at least they are taking a lot of time and memory on my local runs using the examples config I think that merging #4420 will fix the issues, so unfortunately, that becomes another dependency! |
cb04258
to
98a5f38
Compare
Results are currently showing that the injection run is using the full_data data. I think I have a solution, and am testing at the moment |
Looks like there's now conflicts here. Poke me when this is ready! |
98a5f38
to
a8e9692
Compare
Okay @spxiwh, this looks sensible on the results pages now - I thought the wrong data was being passed to the minifollowups, but it was me trying to overwrite rather than use a fresh directory I'm slightly worried that the config file syntax becomes very complicated at this point, but I suppose this is a non-standard use-case |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks. I think I'm largely okay with this, do see some of the comments though.
plotting.ini \ | ||
executables.ini \ | ||
injections_minimal.ini \ | ||
injections_unknown.ini \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
By how much does this slow down the example search workflow. While we want to test everything, the CI is becoming overstressed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can revert the changes to the CI after final approval, if that is best?
Handwaving discussion over time though:
- the workflow additions are all parallel to already-running jobs
- the generation of data with injections does take a few minutes of added time before the workflow starts, and is not parallel
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The CI only has a single machine, so cannot parallelize. It's going to have to run all the jobs here one by one (or perhaps two-by-two). Therefore we do need to know how much extra time this is adding, and if this feature is "niche" enough to warrant this, given that there's probably many possibilities in workflow that we are not testing.
elif injection_method == "IN_FRAME": | ||
# IN_FRAME means that we will supply frames with the appropriate | ||
# injections unknown to the analysis | ||
inj_files.append(None) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My only substantial comment is that having a File be None is a hack. If anyone accidentally sent this file to a job, pegasus is going to [rightly] break. There doesn't seem to be a good way to handle this though ...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, though people shouldn't do that, there wont be a nice error message to say what they have done which is wrong either
Okay, I'm going to remove this from my "to-do" list now. Please do consider CI time, but merge this once you are happy and the CI tests (including the search workflow!) have completed. If you make any more substantive changes, explicitly poke me again. |
@spxiwh is the use case for this going to come up in the near future to your knowledge? If so, we will want this to be on master |
This will need a substantial re-think, as the use-case I've implemented here is unlikely to be used in the near future What we are going to need is the case where injection frame files are supplied, and an injection definition file is also given, but we are not injecting them ourselves. As a result, I will close, and think about how to adapt this for that use-case. |
Allow the offline search to use injections which do not have an associated injection file.
For the use-case I am thinking of, we would be provided frame files with an unknown set of injections by an external team.
We would then use the alternative datafind settings on the config for the injected data, but use the original uninjected data for the background in the same way we do for known injections.
These would then show up as an additional minifollowup in the open box results, and a
.hdf
results file is made in the open box results directory which can be used for reporting resultsDraft status is due to ongoing testing, this will be removed once the testing has completed and passed sensible checks