[FEATURE] Cancel long-running calculations #1249

@mgovers

Describe the feature request
If a user expects a calculation to take milliseconds but it takes a couple of minutes instead, something is probably wrong, and they may want to cancel the calculation.

This is especially useful for people running code in (Jupyter) notebooks. A simple typo may cause the duration of a calculation to blow up, especially since the introduction of multi-dimensional datasets in #1201. A user who notices this may want to cancel the calculation as soon as possible, without having to kill and restart the entire kernel.

This type of cancellation is typically handled using KeyboardInterrupt (Ctrl+C events), signal handling (e.g. SIGINT), by setting a flag, or by using stop tokens.

Considerations

  • Cfr. Python documentation (https://docs.python.org/3/library/signal.html#execution-of-python-signal-handlers):

    A long-running calculation implemented purely in C (such as regular expression matching on a large body of text) may run uninterrupted for an arbitrary amount of time, regardless of any signals received. The Python signal handlers will be called when the calculation finishes.

    This has implications:

    • We need to double-check that it is possible to interrupt a PGM C API call from the same thread that made the call. Note that this may be platform-dependent.
      • If not, we may need to run long operations (PowerGridModel constructors, calculations and (de)serializations) in a separate worker thread from Python.
    • We do not need to consider cancelability at the deepest level. It is fine to only check after every scenario whether a stop was requested.
  • Also cfr. Python documentation (https://docs.python.org/3/library/exceptions.html#KeyboardInterrupt and https://docs.python.org/3/library/signal.html#note-on-signal-handlers-and-exceptions):

    [...] applications that are complex or require high reliability should avoid raising exceptions from signal handlers. They should also avoid catching KeyboardInterrupt as a means of gracefully shutting down. Instead, they should install their own SIGINT handler.

    • We should follow their recommendations.
  • Xcode supports std::jthread starting with version 26 (cfr. https://developer.apple.com/documentation/xcode-release-notes/xcode-26-release-notes). Other compilers have supported it for much longer. This means that we can finally use it. It comes with a built-in stop token feature, which simplifies thread cancellation handling.
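To make the considerations above concrete, here is a minimal Python sketch of the "own SIGINT handler plus worker thread" idea. It does not call PGM at all: `run_scenarios`, `run_cancelable` and `OperationCanceled` are hypothetical names, and the `time.sleep` call merely stands in for one scenario of a long calculation. The handler only sets a flag and raises nothing, and the flag is checked between scenarios only.

```python
import signal
import threading
import time


class OperationCanceled(Exception):
    """Hypothetical exception for a calculation that was stopped on request."""


def run_scenarios(n_scenarios, stop_requested, work=lambda i: time.sleep(0.01)):
    """Stand-in for a batch calculation; checks the flag between scenarios only."""
    done = 0
    for i in range(n_scenarios):
        if stop_requested.is_set():
            break  # a running scenario is never interrupted halfway
        work(i)
        done += 1
    return done


def run_cancelable(n_scenarios):
    """Run the batch in a worker thread; SIGINT sets a flag instead of raising."""
    stop_requested = threading.Event()
    result = {}

    def worker():
        result["done"] = run_scenarios(n_scenarios, stop_requested)

    def on_sigint(signum, frame):
        stop_requested.set()  # no exception is raised from the handler

    # signal.signal may only be called from the main thread.
    previous = signal.signal(signal.SIGINT, on_sigint)
    try:
        thread = threading.Thread(target=worker)
        thread.start()
        # Join with a timeout so the main thread keeps executing Python code
        # and therefore gets a chance to run the signal handler.
        while thread.is_alive():
            thread.join(timeout=0.05)
    finally:
        signal.signal(signal.SIGINT, previous)  # restore the previous handler

    if result["done"] < n_scenarios:
        raise OperationCanceled(f"stopped after {result['done']} scenarios")
    return result["done"]
```

Whether the real implementation needs the worker thread at all depends on the outcome of the same-thread experiment mentioned above; the sketch simply assumes the worst case.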

Design Proposal (requires experimentation before finalization)

  • C++ core: support canceling multi-threaded calculations
    • Shift to using std::jthread instead of std::thread; it comes with a built-in stop token, which allows canceling threads
    • In the calculations, check the stop token between scenarios. NOTE: we do not need to check within a scenario, cfr. the considerations mentioned before.
    • Register a signal handler
    • Raise a new exception for canceled operations (e.g. OperationCanceled)
  • C API: add a stop-token feature.
    • Add a function PGM_request_stop.
    • Handle the stop request, either by (TBD):
      • Sending a signal to each thread; or
      • Setting the stop token for each job that is in-progress for the current handle.
    • An interrupted calculation should report a new exception type, e.g. OperationCanceled
  • Python wrapper:
  • Documentation: aside from the basic Python/C API reference, I do not believe we need any, as it is "intuitive" to try to cancel an operation.
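As a rough illustration of the "set the stop token for each in-progress job of the current handle" option, the following pure-Python sketch mimics the behavior with a shared flag. Everything in it is an assumption made for illustration: `Handle`, `request_stop` and `calculate_batch` are hypothetical stand-ins, not the PGM C API or the Python wrapper, and `threading.Event` only plays the role that a stop token would play in the C++ core.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class OperationCanceled(Exception):
    """Hypothetical exception type for a canceled calculation."""


class Handle:
    """Hypothetical stand-in for a C API handle that owns one stop flag."""

    def __init__(self):
        self._stop = threading.Event()

    def request_stop(self):
        """Plays the role of the proposed PGM_request_stop."""
        self._stop.set()

    def calculate_batch(self, scenarios, solve, threads=4):
        """Solve scenarios on several threads, checking the flag between scenarios."""
        self._stop.clear()
        results = [None] * len(scenarios)

        def job(start):
            # Each thread handles every `threads`-th scenario.
            for i in range(start, len(scenarios), threads):
                if self._stop.is_set():
                    return  # stop before starting the next scenario
                results[i] = solve(scenarios[i])

        with ThreadPoolExecutor(max_workers=threads) as pool:
            futures = [pool.submit(job, start) for start in range(threads)]
            for future in futures:
                future.result()  # re-raise errors from the worker threads

        if self._stop.is_set():
            raise OperationCanceled("batch calculation was canceled")
        return results
```

Because all jobs of one handle share the same flag, a single `request_stop()` call stops every thread at its next scenario boundary, which matches the granularity proposed above.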

To test

  • Start a notebook or interactive Python session.
  • Create a very long-running PGM batch calculation (at least a couple of seconds or minutes, but not hours; verify that the calculation indeed takes this long).
  • Re-run the calculation, but this time press Ctrl+C (or Delete); this should raise a KeyboardInterrupt.
  • The PGM calculation should abort quickly (potentially not immediately, but definitely faster than in the first run).

Originally posted by @mgovers in #1245 (comment)
