Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix coroutine continuation cleanup #181

Merged
merged 4 commits into from
Jul 18, 2024

Conversation

jb-gcx
Copy link
Contributor

@jb-gcx jb-gcx commented Jul 15, 2024

C++ coroutines clean up their local variables after invoking return_value, return_void or unhandled_exception. This means that executing a continuation in one of these methods basically inverts the cleanup order. Example:

#include <Future.hpp>
#include <iostream>
#include <thread>

struct LogOnDestruct {
    std::string message;
    ~LogOnDestruct() { std::cout << message << "\n"; }
};

djinni::Future<void> foo() {
    LogOnDestruct log{"~foo"};

    // The issue only occurs when we suspend
    // So make sure to suspend on another future
    djinni::Promise<void> promise;
    std::thread{[&promise] {
        std::this_thread::sleep_for(std::chrono::milliseconds(10));
        promise.setValue();
    }}.detach();
    co_return co_await promise.getFuture();
}

djinni::Future<void> bar() {
    LogOnDestruct log{"~bar"};
    co_await foo();
}

int main() {
    bar().get();
    return 0;
}

Before this PR, the above code produced:

~bar
~foo

Which is inverted (caller gets cleaned up before callee). This can even lead to crashes etc, if a reference to a local variable from bar is used during cleanup in foo - because that local variable in bar was already cleaned up!

After this PR the code produces:

~foo
~bar

As it should.

Merge considerations

  • I use a requires, which is a C++20 feature. I decided that this is acceptable, since coroutines themselves are also a C++20 feature. If this is not okay, let me know and I'll change it to a more compatible SFINAE approach.
  • I'd like to add a test for this, but I'm not sure how. There is no dedicated C++ test suite and it would take me a while to create one as I have zero bazel experience. Should I just add an XCTest, or what's the preferred way?

@li-feng-sc
Copy link
Contributor

Thank you for this contribution. Regarding your questions:

  • Although Snap has been using C++20 for a while, we haven't get to refresh Djinni to use C++20. There are lots of other places that needs to be cleaned up for C++20 and I would prefer to do them together. So let's try to avoid requires for now if this is not too much trouble.
  • Yeah the tests are focused on testing interoperability so no pure C++ tests. A XCTest is fine.

@jb-gcx jb-gcx force-pushed the gcx/fix-coroutine-continuation branch from 5793459 to ee2bcb1 Compare July 16, 2024 10:41
@jb-gcx
Copy link
Contributor Author

jb-gcx commented Jul 16, 2024

Thanks for the feedback. I've replaced requires with enable_if and added an XCTest. Please take another look.

constexpr bool await_ready() const noexcept {
return false;
}
bool await_suspend(std::coroutine_handle<ConcretePromise> finished) const noexcept {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's use the alias detail::CoroutineHandle for now

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done 👍

@@ -16,6 +16,8 @@

#pragma once

#include "expected.hpp"

#include <atomic>
#include <functional>
#include <memory>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I find I have to change the feature detection to the following to make it work in c++17 + -fcoroutine-ts

#if defined(__cpp_coroutines) && __has_include(<experimental/coroutine>)
    #include <experimental/coroutine>
    namespace djinni::detail {
        template <typename Promise = void> using CoroutineHandle = std::experimental::coroutine_handle<Promise>;
        using SuspendNever = std::experimental::suspend_never;
    }
    #define DJINNI_FUTURE_HAS_COROUTINE_SUPPORT 1
#elif defined(__cpp_impl_coroutine) && __has_include(<coroutine>)
    #include <coroutine>
    namespace djinni::detail {
        template <typename Promise = void> using CoroutineHandle = std::coroutine_handle<Promise>;
        using SuspendNever = std::suspend_never;
    }
    #define DJINNI_FUTURE_HAS_COROUTINE_SUPPORT 1
#endif

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like with these settings the <coroutine> header is available but not functional. Your suggestion fixes that nicely, but I found another issue: with clang 16 and -std=c++20 -stdlib=libc++, it chooses the experimental implementation even though the proper one is available.

I've done some experiments:

  • __cpp_coroutines is available for both C++20 and C++17 + -fcoroutines-ts until clang 17 when the macro and -fcoroutines-ts were completely removed
  • __cpp_impl_coroutine and __cpp_lib_coroutine are only available in proper C++20 mode when we have <coroutine>

I've just pushed another suggestion based on these findings. Here's what I used to test it with a few compilers/settings: https://godbolt.org/z/rj737E5eq

Caveat: I've tested only with clang. For iOS and Android that seems to be the relevant compiler.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks! this works really well.

@jb-gcx jb-gcx requested a review from li-feng-sc July 18, 2024 10:38
@li-feng-sc li-feng-sc merged commit a97afc7 into Snapchat:main Jul 18, 2024
1 check passed
@jb-gcx jb-gcx deleted the gcx/fix-coroutine-continuation branch July 19, 2024 11:09
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants