Triggering Argo Workflow pipelines #819
I have assumed in this that OurAPI extracts authentication information, and information about how to send data back into the plan, from either its environment or from additional parameters not shown |
@run_workflow is something like a Msg pre-processor that, when it sees a StartDocument, inserts the run_id into a message to the workflow engine. |
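For illustration, a minimal sketch of how such a decorator might look, assuming a hypothetical WorkflowClient with a submit() method (not an existing blueapi/workflows API). It watches documents via bluesky's subs_decorator rather than raw Msg objects, since the run uid only surfaces in the start document:

```python
# Hypothetical sketch only: WorkflowClient and submit() are stand-ins.
from bluesky.preprocessors import subs_decorator


def run_workflow(workflow_name: str, client: "WorkflowClient"):
    """Decorate a plan so that every run it opens triggers `workflow_name`."""

    def on_document(name: str, doc: dict) -> None:
        if name == "start":
            # doc["uid"] is the run_id assigned by the RunEngine
            client.submit(workflow_name, run_id=doc["uid"])

    def decorator(plan_fn):
        # Subscribes on entry to the wrapped plan and unsubscribes on exit
        return subs_decorator(on_document)(plan_fn)

    return decorator


# Usage sketch:
# @run_workflow("my_processing", client)
# def my_plan():
#     ...
```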
This shouldn't be a problem, since subscribe returns the uid for that subscription, so the decorator handling the subscriptions can keep track of everything. Also, for interest, MX's main uses for callbacks are:
In Hyperion, we run these in one external process and use Tagging @DominicOram for interest too |
Any processing that cannot be run in the workflow engine (proprietary analysis?) can still make use of the APIs for consuming/sending data, but a way of passing authn information in will be required [probably as a small additional note on the Copier template?]. How will workflows be organised? One repository that, in an ArgoCD-like way, is loaded for the workflow engine? |
So the MX use cases right now are fire-and-forget style callbacks, without any further introspection required by the plans, or where any introspection is e.g. loaded from ISPyB? |
For the external callbacks, yes (as far as I can see). We do have a few internal callbacks in Hyperion which we introspect during the plan, but these are doing very small things and Argo wouldn't need to know about them |
sequenceDiagram
actor Alice
participant Keycloak
participant Blueapi
participant Tiled
participant Argo Workflows
Alice <<->> Keycloak: Log in to blueapi & cache <run_token>
Note left of Keycloak: scopes=["data:write"]
Note left of Blueapi: @run_workflow("my_processing")<br/> def my_plan()
Alice ->>+ Blueapi: run my_plan with <run_token>
Blueapi <<->> Keycloak: on-behalf-of Alice request <process_token><br/> with <run_token>
Note right of Keycloak: scopes=["data:write", "data:read", "analysis:my_processing"]
Blueapi ->> Argo Workflows: run my_processing with <process_token>
create participant my_processing
Argo Workflows ->> my_processing: Creates
loop until plan finishes
Blueapi ->> Tiled: Insert docs callback
Tiled ->>+ my_processing: Tiled API get data
my_processing ->>- Blueapi: New workflow API inform decisions
end
Blueapi ->> Argo Workflows: Finished
destroy my_processing
Argo Workflows ->> my_processing: Finished
Blueapi ->>- Alice: Finished
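For concreteness, the on-behalf-of step above could be an RFC 8693 token exchange against Keycloak's token endpoint. This is only a sketch, assuming token exchange is enabled for the blueapi client; the realm, client credentials, and scope names are all placeholders:

```python
# Sketch only: URL, client credentials, and scopes are assumptions.
import requests

TOKEN_URL = "https://keycloak.example.org/realms/my-realm/protocol/openid-connect/token"


def exchange_for_process_token(run_token: str) -> str:
    """Swap Alice's <run_token> for a <process_token> scoped for the workflow."""
    resp = requests.post(
        TOKEN_URL,
        data={
            "grant_type": "urn:ietf:params:oauth:grant-type:token-exchange",
            "client_id": "blueapi",              # assumed client name
            "client_secret": "***",              # from blueapi's own config
            "subject_token": run_token,          # Alice's cached token
            "subject_token_type": "urn:ietf:params:oauth:token-type:access_token",
            "scope": "data:write data:read analysis:my_processing",
        },
        timeout=10,
    )
    resp.raise_for_status()
    return resp.json()["access_token"]
```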
|
An additional use case of note is that some of our callbacks rely on other callbacks, which we use the
Yes, they are fire-and-forget, but this has caused issues:
When we first started |
This all looks good from my point of view, but I don't think we should get too far into the design without trying something simple first: trigger a post-processing job at the end of a scan, with no feedback into the plan, etc. |
Been talking a lot with @olliesilvester today about how this is going to be accomplished, so wanted to get some words down.
Referencing and sometimes citing https://diamondlightsource.github.io/workflows/docs/
Ollie is working on enabling small in-process callbacks that are required for some plans. Because of the plan stub subscribe, this can be done from within a plan without needing to touch the RunEngine, but a way to make sure that the RunEngine unsubscribes again from the subscription id would be useful, else we risk accumulating an overlapping set of callbacks with every run.
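As a minimal sketch of that bookkeeping (my_callback and the detector list are placeholders), a plan can capture the token returned by bps.subscribe and guarantee the matching unsubscribe with finalize_wrapper:

```python
# Sketch of subscribing an in-process callback for the lifetime of one plan.
import bluesky.plan_stubs as bps
from bluesky.preprocessors import finalize_wrapper


def plan_with_callback(detectors, my_callback):
    # The 'subscribe' Msg returns a token identifying the subscription
    token = yield from bps.subscribe("all", my_callback)

    def _body():
        yield from bps.open_run()
        yield from bps.trigger_and_read(detectors)
        yield from bps.close_run()

    def _cleanup():
        # Runs even if the body fails, so callbacks don't accumulate across runs
        yield from bps.unsubscribe(token)

    yield from finalize_wrapper(_body(), _cleanup())
```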
Because these callbacks are run within the same Python process as blueapi, they are limited by the resources given to it, which should remain minimal. When callbacks grow to require more resources, speed, or repeatability, they should be extracted into the workflow system.
We therefore need to consider how to trigger those workflows. Blueapi already has one persistent callback that sends documents to the message bus for the client and other services; the following assumes that this callback remains in place, either to send messages to a bus or to insert documents into a document store.
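Roughly, that persistent callback has the following shape. This is an illustrative sketch, not blueapi's actual implementation; `producer` and the topic name are placeholders, and a Tiled writer could be plugged in at the same point:

```python
# Illustrative only: `producer` stands in for a STOMP/Kafka client or a Tiled writer.
import json


class DocumentForwarder:
    def __init__(self, producer, topic: str = "data.documents"):
        self._producer = producer
        self._topic = topic

    def __call__(self, name: str, doc: dict) -> None:
        # Called by the RunEngine for every (name, document) pair it emits
        self._producer.publish(self._topic, json.dumps({"name": name, "doc": doc}))


# Subscribed once for the lifetime of the blueapi process:
# RE.subscribe(DocumentForwarder(producer))
```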
In order to enable a complete experiment I believe we need:
A simple plan that triggers an external workflow
A simple workflow, for which there is a copier template (or similar) to build a container and register it with the workflow engine with the name "my_workflow", such that the only thing required to create a new workflow is defining my_analysis or my_per_point_analysis or both (ideally entirely within that single Python file? A rough sketch follows)
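A guess at what the user-edited file in such a template could contain. Everything here is illustrative: the template is assumed to build the container, register it as "my_workflow", and call these hooks with data fetched from Tiled.

```python
# Illustrative only: function names and signatures are assumptions about
# what the copier template would wire into the container entrypoint.
from typing import Any


def my_per_point_analysis(event: dict[str, Any]) -> dict[str, Any]:
    """Optional hook: called for each event document / data point."""
    data = event["data"]
    return {"total_counts": sum(v for v in data.values() if isinstance(v, (int, float)))}


def my_analysis(run: Any) -> dict[str, Any]:
    """Optional hook: called once when the run finishes, e.g. with a Tiled run node."""
    return {"n_points": len(run["primary"]["data"])}
```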
Acceptance Criteria