Progress monitoring #23
Comments
Since you're looking to build a dashboard, would it be useful to retrieve not only the status of the tasks but also the graph representation of the workflows?
@psafont that's interesting but it's not immediately obvious how one would show the status of say, 50 workflows in a single view by using a graph representation. I was thinking in terms of a tabular view with one row per workflow/job, e.g.,
We've played around w/ the idea in Cromwell, but one thing I have always found is that getting people to agree on what that graph should look like is so difficult that it's easier just to give something lo-fi like @brucehoff suggests. Keep in mind that this isn't supposed to be some uber API, but rather something that sits beneath whatever layer the user is ultimately interacting with. That layer can easily generate a visual graph based on information provided in WES.
That was my suspicion too, good to see this kind of granularity can be thrown out early on. Useful per-task information for users could include elapsed time, task status, and maybe even why a task is pending (waiting on resources or on an input). Some of the information could be coarser, e.g. showing statistics about pending tasks instead of showing all the tasks individually, but I don't know that much about the use cases.
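To make the "coarser" option concrete, here is a minimal Python sketch of collapsing per-task records into summary statistics. The `TaskInfo` record, its field names, and the status strings are all hypothetical; nothing in WES defines them, and a real implementation would likely differ.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Optional


@dataclass
class TaskInfo:
    # Hypothetical per-task record; WES does not define this shape.
    name: str
    status: str                           # assumed values: "PENDING", "RUNNING", "COMPLETE"
    elapsed_seconds: float = 0.0
    pending_reason: Optional[str] = None  # assumed values: "resources", "input"


def summarize(tasks):
    """Collapse individual tasks into coarse counts suitable for one dashboard row."""
    by_status = Counter(t.status for t in tasks)
    # Only pending tasks contribute a reason; a missing reason is counted as "unknown".
    pending_reasons = Counter(
        t.pending_reason or "unknown" for t in tasks if t.status == "PENDING"
    )
    return {
        "by_status": dict(by_status),
        "pending_reasons": dict(pending_reasons),
        "total_elapsed_seconds": sum(t.elapsed_seconds for t in tasks),
    }
```

A client could show `by_status` and `pending_reasons` per workflow instead of listing every task, and only fetch the individual tasks on demand.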
Right now, the Any suggestions on how best to close this? Should WES offer a specific endpoint for reporting status using schematized messages? Should we specify a very clearly formatted log message that WES services will know to parse into progress?
I would suggest (1) extending the
@brucehoff -- I might be misunderstanding here -- but is it helpful to just get a guarantee of key/value pairs? Any WES implementation with active usage certainly has this need, no doubt about it at all. I just think this is the type of thing that's hard to standardize -- and I'm not sure guaranteeing some structured information enforces what you'd like. Instead -- this is where implementors should have flexibility and just listen to their users and come up with creative solutions for showing progress of a workflow -- thoughts?
@brucehoff curious to know whether this is still an issue given the current implementation of WES
@patmagee Are you saying that there is an update to WES that obviates the need for this requested feature? If so, please let me know what that is and I will be happy to comment on whether it meets the perceived need.
@brucehoff You can get the current running task from the list of tasks in the From WES's perspective, I do not think it's reasonable to know the percentage of a task's completion, since there is really no way to know this in a generalized way. Many bioinformatics tools can run for minutes, hours or longer and they do not report progress, so I am not sure how WES could know what percentage done the individual tasks are (unless you meant the overall progress through the workflow... which is still a hard problem). In the same vein, I am not sure how WES would be able to report things like model convergence, since that is a very specific problem and requires WES to understand WHAT it is running. At the moment WES does not need to understand the context of a workflow; it is simply the distribution and reporting API. It would be great to work more machine learning concepts into WES, but maybe as a separate extension or even a custom implementation (there is nothing preventing an implementor from adding that). So if the tasks are sufficient, then I think we can close this.
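For reference, a rough Python sketch of what "get the current running task from the list of tasks" could look like on the client side. It assumes the run log has already been fetched and decoded into a dict, and that it carries a `task_logs` list whose entries have `name`, `start_time`, `end_time` and `exit_code` fields. Those field names are an assumption about the deployed WES version, so check them against the schema your server actually returns.

```python
def running_tasks(run_log):
    """Return the names of tasks that have started but not yet finished.

    `run_log` is assumed to be an already-decoded JSON object; the field
    names used below are assumptions, not guaranteed by every WES server.
    """
    names = []
    for task in run_log.get("task_logs") or []:
        started = bool(task.get("start_time"))
        # Treat either an end time or an exit code as evidence the task is done.
        finished = bool(task.get("end_time")) or task.get("exit_code") is not None
        if started and not finished:
            names.append(task.get("name", "<unnamed>"))
    return names
```

Note this only says which tasks are in flight. It gives no percentage for any of them, which is the limitation described above.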
I was reminded today that there is at least policy, if not more technical standards, around "return of results" with patient data. I feel like the issue here almost merits a separate Cloud WS standard for "return of status": at least a common data model, if not an API. WES and TES, maybe even TRS, could interpret and parse data communicated in this model to display for monitoring or other purposes. I agree that, similar to workflow languages themselves, specifying a common model for run status is beyond the scope of WES. However, we could include the idea in a wish list / backlog for future Cloud standards.
Sure there is: The running workflow can communicate this information to the WES-compliant workflow execution engine which can, in turn, return this to the client which initiated the workflow.
Again, the running workflow can communicate this information to the WES-compliant workflow execution engine which can, in turn, return this to the client which initiated the workflow. The overarching idea is to expand the scope of the standard from an API that only the client interacts with to an API that the running workflow can also interact with.
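As a purely illustrative sketch of that two-sided idea, the engine could keep a small store that the running workflow writes to and the client reads from. Everything here (the `ProgressStore` class, its `report` and `get` methods, the key names) is hypothetical and not part of WES; a real service would expose these through endpoints and persist the data.

```python
import threading


class ProgressStore:
    """In-memory stand-in for the engine side of a hypothetical progress channel."""

    def __init__(self):
        self._lock = threading.Lock()
        self._progress = {}  # run_id -> dict of key/value pairs

    def report(self, run_id, **pairs):
        # Called on behalf of the running workflow; later values overwrite earlier ones.
        with self._lock:
            self._progress.setdefault(run_id, {}).update(pairs)

    def get(self, run_id):
        # Called on behalf of the client; returns a copy so callers cannot mutate state.
        with self._lock:
            return dict(self._progress.get(run_id, {}))


store = ProgressStore()
store.report("run-1", percent_complete=40, loss=0.31)  # workflow side
store.report("run-1", percent_complete=55)             # a later update
snapshot = store.get("run-1")                          # client side
```

The workflow decides what the keys mean, so the engine never needs to understand what it is running; it only relays the pairs.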
I agree that it would be awesome to have WES interact with workflow engines to report these stats (and possibly others in the future). It goes a bit in the direction of what @denis-yuen proposed in ga4gh/tool-registry-service-schemas#223 (and also see ga4gh/tool-registry-service-schemas#224 and ga4gh/tool-registry-service-schemas#225) for TRS (and, btw, runtime stats reported back from users to TRS might be another way of addressing this issue to some extent). However, I'm not quite sure how to start with this. Coming up with a model in WES for workflow (or worse, TES for tool) developers in a sort of vacuum doesn't seem to me to be very promising. But following @jaeddy: If we have at least one strong use case to drive this and can bring at least two or so workflow types to the table who commit to work it out together and develop such a data model together with the WES team, I think that would be really nice. Would you be willing to drive this @brucehoff? Also tagging @mr-c, @pditommaso, @johanneskoester, @bgruening
There are use cases for retrieving intermediate information from a workflow, e.g.:
Under the existing API, such in-progress information would come from retrieving and reading or parsing the workflow's log files. If the information could be returned in a more structured form (e.g. in a set of key-value pairs) then, when used with #21 a client could create a dashboard of running workflows and/or answer questions like "which of my jobs is closest to complete"?
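A minimal Python sketch of the dashboard question, assuming each run's progress has already been retrieved as a dict of key-value pairs. The `percent_complete` key is a made-up convention for illustration; the proposal above does not fix any key names.

```python
def closest_to_complete(progress_by_run, key="percent_complete"):
    """Rank run IDs from most to least complete.

    `progress_by_run` maps run_id -> dict of key/value pairs. Runs that have
    not reported a usable numeric value under `key` are left out.
    """
    ranked = []
    for run_id, pairs in progress_by_run.items():
        try:
            ranked.append((float(pairs[key]), run_id))
        except (KeyError, TypeError, ValueError):
            continue  # this run has not reported a usable value
    ranked.sort(reverse=True)
    return [run_id for _, run_id in ranked]
```

With plain log files, the same question would require a parser per workflow; with structured pairs, the client only needs to agree with the workflow on a key.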