This repository has been archived by the owner on May 18, 2021. It is now read-only.

Transactional Creation of Flows #116

Open
gviedma-zz opened this issue Nov 29, 2017 · 4 comments

Comments

@gviedma-zz

Currently, when a flow is created by the primary function invocation, the stages are executed eagerly as the flow is constructed. If adding a stage subsequently fails, the user's flow will be partially executed and partially constructed. Stages that executed prior to the construction failure could have side effects, and any subsequent error-handling will be skipped, preventing the user from managing rollbacks.

For example, a function invocation could attempt to construct the following flow

Flows.currentFlow().invokeFunction(otherFunction).thenApply(operation).exceptionally(rollback);

successfully creating an invoke stage and executing otherFunction with side-effects. It could then fail to create the next stage associated with operation. This failure will bubble up and the function call will fail, also preventing the exceptionally stage from being constructed and running the rollback logic.
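The failure mode above can be reproduced with a toy model. This is only a sketch: `EagerFlow` and its methods are hypothetical stand-ins that mimic eager execution, not the Fn Flow API, and the failure of the `thenApply` stage is hard-coded for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: none of these types belong to the Fn Flow API.
class EagerFlowSketch {

    /** Hypothetical flow builder that executes a stage as soon as it is added. */
    static class EagerFlow {
        boolean rollbackRegistered = false;

        EagerFlow invoke(Runnable sideEffect) {
            sideEffect.run(); // eager: runs before any later stage exists
            return this;
        }

        EagerFlow thenApply(Runnable operation) {
            // Simulates a platform error while creating this stage.
            throw new IllegalStateException("could not create thenApply stage");
        }

        EagerFlow exceptionally(Runnable rollback) {
            rollbackRegistered = true;
            return this;
        }
    }

    /** Builds the flow from the example and reports what happened. */
    static List<String> run() {
        List<String> log = new ArrayList<>();
        EagerFlow flow = new EagerFlow();
        try {
            flow.invoke(() -> log.add("side effect of otherFunction"))
                .thenApply(() -> log.add("operation"))
                .exceptionally(() -> log.add("rollback"));
        } catch (IllegalStateException e) {
            log.add("invocation failed: " + e.getMessage());
        }
        log.add("rollback registered: " + flow.rollbackRegistered);
        return log;
    }

    public static void main(String[] args) {
        run().forEach(System.out::println);
    }
}
```

In this model the side effect is logged, the invocation fails, and the rollback handler is never attached, which is the situation described above.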

As a developer of Fn Flow applications with side-effects, I would like to be able to execute the above types of flows safely in a transactional manner, such that any operations associated with my flow are only executed once the flow has been safely constructed and my error-handling is in place.

@gviedma-zz
Author

I see a couple of approaches to provide transactional execution semantics:

  1. Create the complete flow with a one-shot client request that serializes the fully constructed flow on the client-side and pushes it to the Flow Server. This operation only succeeds once the complete flow is committed on the server, at which point it is safe to start execution.

  2. FDKs automatically commit any stages that were created during the current function invocation after the user's handler finishes executing. The commit request to the Flow Server should explicitly enumerate the stage IDs that it is committing, to disambiguate from concurrent (hot) invocations of the same function.

Approach 2) is most in line with our current approach and would require minimal changes to the Flow protocol: a new commitStages API operation, which the Flow Server would use to mark individual flow stages as committed for execution.
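To make approach 2) concrete, here is a minimal in-memory sketch of what a commitStages operation could look like on the server side. `ToyFlowServer`, the stage ID format, and the all-or-nothing validation are assumptions made for illustration; the real protocol operation does not exist yet and may look different.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// In-memory stand-in for the Flow Server; not the real protocol.
class CommitStagesSketch {
    enum State { PENDING, COMMITTED }

    static class ToyFlowServer {
        private final Map<String, State> stages = new LinkedHashMap<>();
        private int nextId = 0;

        /** Adds a stage without executing it and returns its (hypothetical) ID. */
        synchronized String addStage() {
            String id = "stage-" + nextId++;
            stages.put(id, State.PENDING);
            return id;
        }

        /** Marks exactly the listed stages as committed; rejects unknown IDs up front. */
        synchronized void commitStages(List<String> stageIds) {
            for (String id : stageIds) {
                if (!stages.containsKey(id)) {
                    throw new IllegalArgumentException("unknown stage: " + id);
                }
            }
            for (String id : stageIds) {
                stages.put(id, State.COMMITTED);
            }
        }

        synchronized State stateOf(String id) {
            return stages.get(id);
        }
    }

    /** Two interleaved invocations of the same function; only A commits. */
    static List<State> demo() {
        ToyFlowServer server = new ToyFlowServer();
        String a1 = server.addStage(); // created by invocation A
        String b1 = server.addStage(); // created by concurrent invocation B
        String a2 = server.addStage(); // created by invocation A
        server.commitStages(List.of(a1, a2)); // A finishes and commits its own stages
        return List.of(server.stateOf(a1), server.stateOf(b1), server.stateOf(a2));
    }
}
```

Because the commit names its stage IDs explicitly, invocation B's stage stays pending even though it was created between A's two stages.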

When it comes to supporting the Flow Server await operation (FDK get), which blocks waiting for a stage's value, the FDKs could implicitly commit all the stages created prior to the get call.
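A rough sketch of the implicit commit on get, assuming a hypothetical FDK-side buffer (`PendingStages`) that holds stages until they are committed. For simplicity the toy runs committed stages synchronously on the calling thread, whereas a real FDK would send a commit request to the Flow Server.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.function.Supplier;

// Toy model only: PendingStages is a hypothetical FDK-side buffer.
class ImplicitCommitOnGetSketch {

    static class PendingStages {
        private final List<Runnable> uncommitted = new ArrayList<>();

        /** Registers a stage; nothing runs until the stage is committed. */
        <T> CompletableFuture<T> supply(Supplier<T> work) {
            CompletableFuture<T> result = new CompletableFuture<>();
            uncommitted.add(() -> {
                try {
                    result.complete(work.get());
                } catch (RuntimeException e) {
                    result.completeExceptionally(e);
                }
            });
            return result;
        }

        /** Commits (and, in this toy, immediately runs) every stage created so far. */
        void commit() {
            List<Runnable> toRun = new ArrayList<>(uncommitted);
            uncommitted.clear();
            toRun.forEach(Runnable::run);
        }

        /** Blocking read: commits first, otherwise join() would never return. */
        <T> T get(CompletableFuture<T> stage) {
            commit();
            return stage.join();
        }
    }
}
```

Stages created after the get call stay uncommitted until the next get or the end of the invocation.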

The current commit operation is not used for transactional execution guarantees. Instead, it designates that the primary Flow function invocation has completed successfully, so that the associated flow can be completed and passivated on the server side once no pending stages are left to execute. It makes sense to maintain such global commit semantics, but we should ensure that the FDK makes this commit exclusively right before the primary function invocation completes. Currently, every flow function invocation makes a global commit, which could result in a race condition: concurrent invocations of the function may complete before the primary invocation, marking the flow as completable before it is safe to do so.
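The race can be illustrated with a deterministic toy interleaving. `ToyFlow` and its `onlyPrimaryCommits` switch are hypothetical and only model the rule "completed = committed and no pending stages"; they are not taken from the Flow Server.

```java
// Toy model only: ToyFlow is not the Flow Server's implementation.
class GlobalCommitSketch {

    static class ToyFlow {
        private final boolean onlyPrimaryCommits;
        private boolean committed = false;
        private int pendingStages = 0;

        ToyFlow(boolean onlyPrimaryCommits) {
            this.onlyPrimaryCommits = onlyPrimaryCommits;
        }

        void addStage() {
            if (isCompleted()) {
                throw new IllegalStateException("flow already completed");
            }
            pendingStages++;
        }

        void stageFinished() {
            pendingStages--;
        }

        /** Global commit; ignored for non-primary invocations when the fix is enabled. */
        void commit(boolean fromPrimaryInvocation) {
            if (fromPrimaryInvocation || !onlyPrimaryCommits) {
                committed = true;
            }
        }

        boolean isCompleted() {
            return committed && pendingStages == 0;
        }
    }

    /** Returns true if the primary invocation can still add a stage after a child commits. */
    static boolean primaryCanKeepBuilding(boolean onlyPrimaryCommits) {
        ToyFlow flow = new ToyFlow(onlyPrimaryCommits);
        flow.addStage();      // primary invocation adds its first stage
        flow.commit(false);   // that stage's own invocation ends and commits
        flow.stageFinished(); // no stages pending for a moment
        try {
            flow.addStage();  // primary invocation is still building the flow
            return true;
        } catch (IllegalStateException e) {
            return false;
        }
    }
}
```

With commits accepted from any invocation, the flow is treated as completed while the primary invocation is still adding stages; restricting the global commit to the primary invocation avoids that in this model.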

@zootalures
Member

zootalures commented Nov 29, 2017

I'd like to see a UX/API treatment for this - especially in Java, how do (do?) we expose this?

main questions:

  • is there a sensible default and what is it
  • is this completely standalone as a feature (i.e. valuable without stage retries)

So would be good to see this expressed as an example/explanation from a user POV

@gviedma-zz changed the title from "Transactional Execution of Flows" to "Transactional Creation of Flows" on Dec 11, 2017
@gviedma-zz
Author

gviedma-zz commented Dec 11, 2017

It is worth distinguishing two types of errors for the sake of this discussion:

  1. Errors that occur during creation of a flow.

IMO flow creation should always be implicitly transactional, without imposing any additional burden on the function developer. Errors preventing a flow from being created are always platform or server-side errors and are thus inherently a platform concern. The developer has little or no control over them and they are unrelated to the application logic. There are also safety concerns around eager execution of stages that may have side-effects before subsequent error-handling stages (performing rollback, for instance) have been successfully attached to a flow.

Therefore, I suggest that we:

  • only commit and start executing a flow's pending stages once the function's current invocation has successfully completed

  • fail the current invocation, and do not commit/execute any pending stages on encountering any errors that would prevent us from adding any of that invocation's pending stages to a flow

  2. Errors that occur during a flow's execution.

I believe providing transactional semantics around the execution of one or more stages is an application concern and needs to be exposed to the user, most likely having UX implications. This is beyond the scope of this ticket which instead focuses on 1).
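For the creation-error case in 1), the two suggested rules could be sketched on the FDK side roughly as follows. `InvocationContext`, `invoke`, and the `committedSink` list are hypothetical names used only to show the control flow; a real FDK would send the committed stage IDs to the Flow Server instead of appending them to a list.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Toy model only: shows "commit on success, discard on failure" for one invocation.
class CommitOnSuccessSketch {

    /** Collects the IDs of stages created during a single invocation. */
    static class InvocationContext {
        private final List<String> pending = new ArrayList<>();

        void addStage(String stageId) {
            pending.add(stageId);
        }
    }

    /**
     * Runs the user's handler. Pending stages are committed only if the handler
     * returns normally; on any error the invocation fails and nothing is committed.
     */
    static void invoke(Consumer<InvocationContext> handler, List<String> committedSink) {
        InvocationContext ctx = new InvocationContext();
        handler.accept(ctx);               // an exception here propagates to the caller
        committedSink.addAll(ctx.pending); // reached only on successful completion
    }
}
```

A handler that throws after adding a stage leaves `committedSink` empty, so no stage with side effects runs without its error handling attached.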

@jan-g
Contributor

jan-g commented Dec 11, 2017

Noting some changes/corner cases in what the user'd see (with implicit or explicit transactions) in addition to await (which I'd also lean toward issuing an implicit commit).

  • for any variant on "externally completable" stages: we want the stage to exist before we publish a URL to poke it with. Do we autocommit on requesting such a URL?

  • for any complete/cancel operations that poke a value directly into a stage, it need not be an error for those to run against an uncommitted txn.


3 participants