
make GILOnceCell threadsafe #4512

Open · wants to merge 8 commits into main

Conversation

davidhewitt (Member)

This changes GILOnceCell to be thread-safe. I do this by adding a std::sync::Once to the GILOnceCell, which blocks multiple writers from concurrently writing to the cell (this is almost exactly how std::sync::OnceLock works).

This comes at the cost of an atomic load on every access to the GILOnceCell. I do this on all builds for simplicity of our implementation, so there is a bit of slowdown on the non-free-threaded builds, but I don't think this will be catastrophic. I also think it's better to keep the performance characteristics the same across builds for consistency.

I considered using a Python critical section here instead of the Once, but I came to realise that was not necessary for the short-lived lock around the write.

cc @ngoldbaum
(cc @colesbury)

alex (Contributor) commented Sep 2, 2024

A few questions:

  1. AFAICT, GILOnceCell no longer meaningfully relies on the GIL at all, and is more or less the same as OnceLock. Am I understanding correctly?
  2. I'm not sure I understand why the deadlock scenario described in https://pyo3.rs/v0.22.2/faq#im-experiencing-deadlocks-using-pyo3-with-lazy_static-or-once_cell is no longer possible.

alex (Contributor) commented Sep 2, 2024

Ok, on deeper review, I believe there's a difference between this behavior and OnceLock:

When get_or_init() is called, if the cell is not already initialized, this will call f() without the Once lock held, thus multiple threads may call f() concurrently, and only the first writer's value is retained. This is distinct from OnceLock, where f() is called with the lock held.

This actually addresses both of my questions. However, I think it makes the documentation a bit misleading: the deadlock prevention here has nothing to do with GILOnceCell's reliance on the GIL, and everything to do with the behavior I just described.

davidhewitt (Member, Author)

Yes exactly, this change preserves the existing runtime semantics while removing reliance on the GIL. Agreed the documentation is now out of date, will correct that. The name is also unfortunate now that there is no reliance on the GIL.

I think that multiple concurrent calls to f() are likely to be unhelpful, so I think we should also add PyOnceLock, which guarantees only a single call to f(). I think I have an implementation which is a reasonably thin wrapper around std::sync::OnceLock (with a backport for MSRV).

A possible alternative is to disable GILOnceCell on the free-threaded build (as we have decided to do with GILProtected) and recommend migration to PyOnceLock.

davidhewitt (Member, Author)

Actually, I started writing PyOnceLock and found that for all operations except for initialization I wanted to forward to OnceLock. I also imagined that it was possible there might be a mix of initialization under Python and outside of Python, so forcing users to initialize only under Python was potentially unhelpful (i.e. by making get_or_init take py: Python<'_>).

Instead, I wonder if an extension trait to add a helper method to OnceLock to do a dance with the GIL (or the GC) is sufficient: #4513

colesbury left a comment:

In init, there are two lines:

        let value = f()?;
        let _ = self.set(py, value);

You can get behavior that's closer to the GIL build by wrapping the two lines in a Py_BEGIN/END_CRITICAL_SECTION. Basically, for some (but not all) implementations of f(), the GIL ensured that f() was only called once. A Py_BEGIN/END_CRITICAL_SECTION should ensure those same f's are called only once without the risk of introducing a deadlock.

ngoldbaum mentioned this pull request Sep 20, 2024
davidhewitt (Member, Author)

Thanks @colesbury

Ref your comment in the other thread:

There is an awkward bit, which is that we only made the PyCriticalSection_Begin variants that take a PyObject public, and for these use cases it would be a lot cleaner to use PyMutex. I don't think that's a showstopper -- you can either create a dummy PyObject or translate the _PyCriticalSection_BeginMutex function to Rust.

I took a look at translating _PyCriticalSection_BeginMutex to Rust, but it relies on a lot of other internal functions and ultimately on access to PyThreadState, which we deliberately don't wrap. So I think the only option would be a dummy PyObject. Is it sound to leave the structure contents (other than the mutex) uninitialized, or do I need to create a real PyObject?

Either option makes me a bit uneasy so I might just prefer to merge this PR without the critical section and add that improvement later.

davidhewitt (Member, Author)

Ok, I think this is now ready for review. I've updated the documentation so that it no longer states that the implementation depends on the GIL, while still describing how the two interact.

colesbury

Is it sound to leave the structure contents (other than the mutex) uninitialized, or do I need to create a real PyObject?

It won't crash, but it's the kind of thing that might break in the future. It's probably safer to at least use PyObject_HEAD_INIT and initialize the ob_type to some valid type, even if it's a dummy type.

Either option makes me a bit uneasy so I might just prefer to merge this PR without the critical section and add that improvement later.

That makes sense to me.

We should also work on exposing _PyCriticalSection_BeginMutex for 3.14 since it would avoid these issues.

ngoldbaum (Contributor) left a comment:

I'm not enough of a Rust expert to reason through the safety implications, but I have some minor doc suggestions. I carefully read through the code and didn't spot any issues.

src/sync.rs — resolved review threads
        // inside the `call_once_force` closure.
        unsafe {
            // `.take().unwrap()` will never panic
            (*self.data.get()).write(value.take().unwrap());
A reviewer (Contributor) commented:
Neat, I'd never seen this pattern before; it took me a few minutes to puzzle out, but it's a nice pattern that only consumes the value if it's actually written.

        /// let cell = GILOnceCell::new();
        /// {
        ///     let s = String::new();
        ///     let _ = Python::with_gil(|py| cell.set(py, A(&s)));
A reviewer (Contributor) commented:

One of the builds is failing here:

        error[E0597]: `s` does not live long enough
          --> src/sync.rs:142:50
           |
        24 |     let s = String::new();
           |         - binding `s` declared here
        25 |     let _ = Python::with_gil(|py| cell.set(py,A(&s)));
           |                              ----                ^ borrowed value does not live long enough
           |                              |
           |                              value captured here
        26 | }
           | - `s` dropped here while still borrowed
        27 | } _doctest_main_src_sync_rs_121_0() }
           | - borrow might be used here, when `cell` is dropped and runs the `Drop` code for type `GILOnceCell`

davidhewitt (Member, Author) replied:

Ah, my mistake: this is meant to be marked compile_fail, as a UI test (imported from std) that the type behaves properly.

davidhewitt and others added 2 commits September 30, 2024 19:49
Co-authored-by: Nathan Goldbaum <nathan.goldbaum@gmail.com>
davidhewitt (Member, Author)

Does anyone else want to review or add opinions before we merge? I think this is probably the next big step in getting free-threaded support working for 0.23.

mejrs (Member) commented Oct 1, 2024

Get and set still take a Python token but no longer need it; should we remove it?

davidhewitt (Member, Author)

Get and set still take a Python token but no longer need it; should we remove it?

Great question. I opted to leave them to avoid the breaking change, especially if we think that in the long run we might remove this type. Perhaps we punt on that for now and can always remove them in a future release?

davidhewitt (Member, Author)

Will proceed to merge this, thanks all for the reviews and feedback!

davidhewitt added this pull request to the merge queue Oct 4, 2024
github-merge-queue bot removed this pull request from the merge queue due to no response for status checks Oct 4, 2024
davidhewitt added this pull request to the merge queue Oct 4, 2024
github-merge-queue bot removed this pull request from the merge queue due to no response for status checks Oct 4, 2024
davidhewitt added this pull request to the merge queue Oct 4, 2024
github-merge-queue bot removed this pull request from the merge queue due to failed status checks Oct 4, 2024
davidhewitt (Member, Author)

The failure seems flaky (and I can't reproduce it locally), so I'm going to retry one last time...

davidhewitt added this pull request to the merge queue Oct 5, 2024
github-merge-queue bot removed this pull request from the merge queue due to failed status checks Oct 5, 2024