MPI Continuations Proposal #6
Comments
After the discussion at the 01/26/2022 virtual meeting I've mulled over different options for how to integrate persistent continuations for persistent operations (i.e., continuations that remain attached to persistent operations after they've executed). The model I came up with retains the hidden property of continuations and provides semantics for moving requests from one continuation to another and for freeing a subset of requests without jeopardizing correctness (at least I like to believe so). Here is the updated API:
int MPI_Continue(
    MPI_Request *op_req,
    MPI_Continue_cb_function cb,
    void *cb_data,
    int flags,
    MPI_Status *status,
    MPI_Request cont_req);
int MPI_Continueall(
    int count,
    MPI_Request op_req[],
    MPI_Continue_cb_function cb,
    void *cb_data,
    int flags,
    MPI_Status status[],
    MPI_Request cont_req);
Notice that
In the case of Starting a persistent operation arms the continuation so that it will trigger once all relevant operations have armed it as well and completed. Thus, reattaching another continuation to an active operation potentially leads to a race condition and is thus erroneous. A continuation may be executed once all requests tied to it
This ensures that no continuation is executed while any operation tied to it has not left the inactive state. Non-persistent continuations disappear once they have executed, and persistent operations will have no continuation attached to them afterwards.
It is possible to attach persistent continuations to non-persistent operations. After the continuation has executed, the non-persistent requests will have been freed, leaving the persistent continuation attached to only the persistent operations in the set of operations (if any). If a persistent continuation was attached to only non-persistent operations, the continuation behaves as if it were a non-persistent continuation (it disappears once all non-persistent operations have been freed).
It is not possible to reattach a continuation to a subset of operations. Instead, a new continuation should be created. The cost will be similar to trying to tie a subset of operations to an existing continuation. The solution above is cleaner as it does not explicitly expose continuation objects to the application space.
Note
The below description reflects an earlier version of the MPI Continuations proposal and is kept for historical purposes. The current version of the proposal can be found in https://github.com/mpiwg-hybrid/mpi-standard/pull/1 and in the following PDF:
https://github.com/mpiwg-hybrid/mpi-standard/files/14565813/continuations_202403011.pdf
Background
MPI provides support for all sorts of non-blocking operations (pt2pt, collectives, RMA, I/O), each returning a request object that can be used to test and wait for the completion of the operation. Once an operation is complete, applications typically react to that change in state, e.g., by deallocating the buffer used by the operation, processing the received message, or starting subsequent operations. The required polling on the requests is impractical for applications that are able to overlap communication with additional work, such as processing available tasks. Request management may become cumbersome and error-prone, especially in multi-threaded applications.
Proposal
This proposal introduces a flexible interface for attaching so-called continuations to operation requests. Continuations are actions that are invoked by the MPI library once the completion of an operation is detected. At most one continuation may be attached to any request object. Once a continuation is attached, the MPI implementation takes back ownership of any non-persistent request, and no copy of the request may be used to test/wait for the completion of the operation. Persistent requests remain valid and may still be used to test/wait for the operation to complete after the continuation has been attached, but no second continuation may be attached to them before the operation completes. It is unspecified whether the continuation has completed execution when a call to MPI_Test/MPI_Wait on a persistent operation request returns. Execution of the continuation may be deferred to a later point.
Continuations may be attached to a single operation request (MPI_Continue) or a set of requests (MPI_Continueall). The latter will cause the continuation to be invoked once all of the provided operations have completed. For each operation request, a status may be provided that will be set before the continuation is invoked. The provided buffer containing the status(es) will be passed to the continuation callback, along with the provided cb_data pointer. MPI_STATUS_IGNORE/MPI_STATUSES_IGNORE may be passed to the registration function instead, in which case that value is passed on to the callback.
Continuation Requests
The continuation is attached to the operation request(s) and registered with the continuation request (cont_request above). Continuation requests are allocated using MPI_Continue_init.
Continuation requests accumulate outstanding continuations and can be used to test/wait for their completion. Continuation requests may themselves have a continuation attached to them, which will be invoked once all registered continuations have completed executing. They can also be used to progress outstanding continuations by calling MPI_Test on them.
Continuation requests are persistent but are not started explicitly. Instead, a continuation request is started implicitly when the first continuation is registered after initialization or after its previous completion.
Execution Context
By default, continuations may be invoked by any application thread calling into the MPI library. Two info keys for calls to MPI_Continue_init are provided to restrict the execution:
"mpi_continue_poll_only": if set to "true", continuations are only invoked when MPI_Test or MPI_Wait is called on the continuation request with which the continuations are registered. (default: "false", i.e., the continuation may be executed at any time)
"mpi_continue_thread": may be "application" (only application threads may execute continuations) or "any" (any thread may execute continuations, incl. MPI progress threads, if available). (default: "application")
Further Info Keys
"mpi_continue_enqueue_complete": if "true" and all operations are already complete when a continuation is attached to a set of requests, the continuation is enqueued for later execution (e.g., while polling on the continuation request). Otherwise, continuations may be executed immediately inside the call to MPI_Continue/MPI_Continueall if all operations were immediately complete. (default: "false")
"mpi_continue_max_poll": the maximum number of continuations to execute when polling (calling MPI_Test) on the continuation request. (default: "-1", i.e., as many as possible)
"mpi_continue_async_signal_safe": if "true", the continuation is async-signal-safe and may be called from within a signal handler. (default: "false")
Resources
The current PDF:
mpi40-report-continuations.pdf
Proposal PR: TBD
Open Questions
A list of open questions (to be used to track discussions):
Integration with Sessions: MPI_Session_continue_init?
Status handling: MPI_Continue and MPI_Continueall, given that one would be passed MPI_STATUS_IGNORE and the other MPI_STATUSES_IGNORE?
General: MPI_Continueany (potentially more resource efficient by reusing the same data structure for several continuations) or MPI_Continuesome (what would the semantics be?)