-
Notifications
You must be signed in to change notification settings - Fork 40
Enrich workflow activity context #85
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Signed-off-by: Javier Aliaga <javier@diagrid.io>
- WorkflowInstanceId: This field will provide a unique identifier for the workflow instance. It is already available in the orchestration context but will now be propagated to the activity context. | ||
|
||
- TaskInstanceId: This field will provide a unique identifier for the same activity among retries. This new field will be part of the [Activity Request](https://github.com/dapr/durabletask-protobuf/blob/main/protos/orchestrator_service.proto) and needs to be populated in the runtime. | ||
|
||
- RetryAttempt: This field will contain the current retry count for the activity execution. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add the expected proto changes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
|
||
### Related proposals | ||
|
||
N/A |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
isn't there some overlap between this proposal and Workflow: Rerun from Activity #80?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do not think so, as a rerun is a completely new run of the activity
|
||
The proposed fields introduced in this document are: | ||
|
||
- WorkflowInstanceId: This field will provide a unique identifier for the workflow instance. It is already available in the orchestration context but will now be propagated to the activity context. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
how is this unique identifier built? same question for TaskInstanceId
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
|
||
- TaskInstanceId: This field will provide a unique identifier for the same activity among retries. This new field will be part of the [Activity Request](https://github.com/dapr/durabletask-protobuf/blob/main/protos/orchestrator_service.proto) and needs to be populated in the runtime. | ||
|
||
- RetryAttempt: This field will contain the current retry count for the activity execution. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is a retry the same as a rerun?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do not think so. I would say that rerun are part of the orchestration events and retries are part of the task execution failure.
Signed-off-by: Javier Aliaga <javier@diagrid.io>
Signed-off-by: Javier Aliaga <javier@diagrid.io>
Sort of continuing my thoughts on last weeks (5/20) release call, I don't follow how activity identifiers provide stronger guarantees against re-execution or otherwise help track activity execution state. Event Source Log Already Handles ReplaysThe event source log persists activity inputs and outputs when an activity successfully completes. During a workflow replay (I'm not talking about the proposed re-run operation), the system reuses persisted outputs and only re-executes activities with no matching log. This already ensures idempotency between replays without needing activity identifiers:
An activity identifier doesn't change this logic - it adds a redundant check without improving reliability. Idempotency Isn't Required for ActivitiesDapr recommends, but doesn't require, idempotent activities. Developers can ensure one-time external calls per-workflow by isolating them in activities because once that result is logged, the activity isn't going to run again. If some sort of uniqueness is needed, developers can generate a deterministic GUID within the workflow context and pass it into the activity as an input, so that activity-level identifier doesn't add value here either. Ultimately, I don't see how this functionality meaningfully improves workflows. The concerns raised are already addressed by the current implementation and adding identifiers to activity contexts doesn't significantly simplify the current approach. While clearer documentation could help to resolve some of the confusion around best practices, I disagree that this is a necessary change. |
Thanks @WhitWaldo for your feedback. This proposal is mostly for idempotency and other patterns not covered by the replay. As you say you can achieve the idempotency by passing the unique identifier as part of the input, but in my experience it feels more natural to be part of the context as you do not have to modify your input to support it. I will look a bit more how other workflow engines supports this, we already check with Temporal and they provide a unique activity Id. I also have been looking a bit more into other patterns, for example, in a case a user of workflows want to monitor all invocations for a given workflow, should they also modify their input to include it? It would be handy to have the instanceId as part of the activity to send it to all third party systems. Again, as you say this proposal can be achieve by other ways, but having more information in the activity context I think it makes it easy to the users to implement this patterns. |
Temporal certainly diverges in several ways from Dapr Workflows - given the limited bandwidth everyone has for each release, for my two cents, I'd rather see the energy flow more towards fixing bugs and enabling new functionality that's not otherwise achievable over something like this, seeing as it is more of an alternative way of doing something already easily achievable and which already has helper method support. Case in point, the workflow must be deterministic and those non-deterministic operations (like getting the current timestamp or creating a GUID) must be placed in activities so the output is persisted for replays. But to enable and simplify precisely this sort of scenario, the With respect to monitoring, identifiers don't really facilitate that scenario quite as much as Cassie's event streaming proposal might, especially coupled with more verbose tracing already produced by the runtime (which presumably already reflects the workflow ID) and reflecting input values. Especially as the activity identifiers themselves aren't addressable, seeing them in the logs would be less useful to me than knowing which piece of the larger work was being tackled by that activity. |
@WhitWaldo I was thinking on logs. I can see a pattern to create a workflow run with instanceId as the correlationId so then you can propagate that id to all different systems and get a unified log |
Thinking in terms of Temporal, they actually break out the notion of Workflows as an entity and as a run so the latter has a separate identifier with each execution. While Dapr Workflows isn't well set up to do the same today, I think having some sort of unique Workflow Execution identifier correlating all the activities across all executions that's paired with another identifier that's unique per Workflow Replay would be useful as a pair of correlation identifiers. But that would need to happen pretty much entirely at the runtime and wouldn't itself be reflected at all in the SDKs. |
@WhitWaldo I simplified the proposal to make only reference to idempotency |
I'm sure we'll talk about this in the release call today, but I'm still going to double down and say that an actor-level identifier is not necessary for idempotency purposes. That's already handled by how the event source log works today. |
That's not accurate, activities can be retried at the daprd level for any number of reasons, including gRPC transport level errors. Users today are reporting multiple activations of activities. |
I guess I remain confused about what this proposal is advocating for then. I read it as asking for activities to have an identifier populated in the activity context that users could read and pass to third-party services as an idempotency token - if that's what's being asked for, I would still suggest this isn't necessary because that identifier can be generated in a replay-safe way on the workflow context and passed in via the input. While I probably could have worded my last response more clearly, I wasn't suggesting that activities don't ever get re-run - as I articulated in an earlier comment, they don't get re-run when they complete successfully. My point being that having the identifier saved alongside the event source log seems duplicative because if there's a completed log for the activity, it ran and it won't again - we don't need an identifier to validate this. If multiple activations are being observed, that sounds like a runtime bug and not something that can or should be addressed by changes to the SDKs, but I didn't see anything about that in this proposal. |
@WhitWaldo the main reason for this is:
I took a look to the event logs and I think having this identifier will also help there. Just an example, I created the following workflow: And this is the event log:
As you can see it is not easy to track the log for one of the activities, even if you have the idempotency/execution key as part of the input it is not present in all the entry logs for the activity execution, like the |
I don't dispute the potential value of having an activity run-specific identifier in the activity context for potential idempotency support, but this would need some consideration with Josh's re-run feature to identify whether that identifier should change between re-runs (meaning it's not useful for idempotency support) or not. Rather, it seems like you'd need an approach as I'd suggested to keep this value consistent between re-run invocations (with a deterministic value passed in from the workflow). On the release call today, this was clarified as having a separate activity ID for each re-run (think of it as a hash specific to the inputs) but the same activity ID for each retry. |
While talking through the re-run functionality on today's release call, it occurred to me - idempotency, traceability, etc. aside, isn't this just an identifier for the activities themselves so they can be identified as the re-run/clone starting point? If that's the case, since we're moving forward with re-run/clone support, this seems like a must-have or there'd be no easy way to identify the starting activity in the cloned workflow. |
Signed-off-by: Javier Aliaga <javier@diagrid.io>
a2fa23d
to
5899d13
Compare
I agree with @WhitWaldo remarks here.
It really is trivial to pass in an additional property on an input model to an Activity, which contains the Idempotency Key to be utilised in the Activity. Also bare in mind that if you truly want to support complete idempotency chains via this proposal, you need to allow an option for a user to specify the Idempotency Key on the Activity Call signature (rather than have Dapr originate one for you implicitly) -- All we've done at this point is shifted the user-defined Idempotency Key from the input model to the method signature, which IMO is not worth it. This can all be solved with better documentation and code examples. |
+1 binding |
1 similar comment
+1 binding |
Preview