add first draft of free-threading page for the guide #4577

Open
wants to merge 3 commits into
base: main

ngoldbaum
Contributor

Moves the existing content on free-threading in the migration guide into its own page. Also adds some new content about things we know are going to be issues for some users.

Comments and suggestions are very welcome. I'd really appreciate ideas for illustrative code examples to add, if anyone has any.

necessary to rely on rust parallelism to achieve concurrent speedups using
PyO3. Instead, you can parallelise in Python using the
[`threading`](https://docs.python.org/3/library/threading.html) module, and
still expect to see see multicore speedups by exploiting threaded concurrency in
Contributor

There is a see see typo.

Contributor

@Icxolu Icxolu left a comment

Sounds good already, added some thoughts that came to mind while reading (from someone who hasn't looked a lot into free-threading yet)

(Nit: Maybe we can use a consistent capitalization of "Python" 🙃)

# Supporting Free-Threaded CPython

CPython 3.13 introduces an experimental build of CPython that does not rely on
the global interpreter lock for thread safety. As of version 0.23, PyO3 also has
Contributor

Suggested change
- the global interpreter lock for thread safety. As of version 0.23, PyO3 also has
+ the global interpreter lock (GIL) for thread safety. As of version 0.23, PyO3 also has

Maybe we want to introduce the acronym for the following section?

Comment on lines 8 to 16
The main benefit for supporting free-threaded Python is that it is no longer
necessary to rely on rust parallelism to achieve concurrent speedups using
PyO3. Instead, you can parallelise in Python using the
[`threading`](https://docs.python.org/3/library/threading.html) module, and
still expect to see see multicore speedups by exploiting threaded concurrency in
Python, without any need to release the GIL. If you have ever needed to use
`multiprocessing` to achieve a speedup for some algorithm written in Python,
free-threading will likely allow the use of Python threads instead for the same
workflow.
Contributor

Maybe we also want to talk a bit about what PyO3/Rust brings to the table here? Perhaps the "fearless concurrency" point: having Send/Sync makes it possible to safely build APIs that can be used in this new context.
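
As one possible illustrative example for this point, here is a minimal sketch that uses only the Rust standard library and no PyO3 APIs. The helper names `run_on_threads` and `count_in_parallel` are hypothetical. It shows how `Send + Sync` bounds let the compiler reject shared state that is not safe to use from several threads:

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::Arc;
use std::thread;

/// Hypothetical helper: runs `work` once on each of `n_threads` threads,
/// all sharing the same `state`. The `Send + Sync` bounds are checked at
/// compile time, so a non-thread-safe `T` (e.g. `RefCell<u64>`) is rejected.
pub fn run_on_threads<T, F>(state: Arc<T>, n_threads: usize, work: F)
where
    T: Send + Sync + 'static,
    F: Fn(&T) + Send + Sync + 'static,
{
    let work = Arc::new(work);
    let handles: Vec<_> = (0..n_threads)
        .map(|_| {
            let state = Arc::clone(&state);
            let work = Arc::clone(&work);
            thread::spawn(move || (*work)(&*state))
        })
        .collect();
    for handle in handles {
        handle.join().expect("worker thread panicked");
    }
}

/// Hypothetical usage: every thread bumps a shared atomic counter.
pub fn count_in_parallel(n_threads: usize, per_thread: u64) -> u64 {
    let counter = Arc::new(AtomicU64::new(0));
    run_on_threads(Arc::clone(&counter), n_threads, move |c: &AtomicU64| {
        for _ in 0..per_thread {
            c.fetch_add(1, Ordering::Relaxed);
        }
    });
    // All workers have been joined, so every increment is visible here.
    counter.load(Ordering::Relaxed)
}
```

Swapping `AtomicU64` for `std::cell::Cell<u64>` in `count_in_parallel` fails to compile because `Cell` is not `Sync`, which is the kind of guarantee the guide could highlight.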

Comment on lines 39 to 50
Instead, you can think about whether or not a rust scope has access to a
Python **thread state** in `ATTACHED` status. See [PEP
703](https://peps.python.org/pep-0703/#thread-states) for more background about
Python thread states and status. In order to use the CPython C API in both the
GIL-enabled and free-threaded builds of CPython, you must own an attached
Python thread state. The `with_gil` function sets this up and releases the
thread state after the closure passed to `with_gil` finishes. Similarly, in both
the GIL-enabled and free-threaded build, you must use `allow_threads` in
order to use rust threads. Both of `with_gil` and `allow_threads` tell CPython
to put the Python thread state into `DETACHED` status. In the GIL-enabled build,
this is equivalent to releasing the GIL. In the free-threaded build, this unblocks
CPython from triggering a stop-the-world for a garbage collection pass.
Contributor

This is very valuable information, but also pretty technical. Maybe we should put this in a <details> tab and summarize it at a higher level, so that it's a bit easier to grasp for new users. The summary could say, for example, that "GIL" refers to the interaction with the Python interpreter, or something like that.

Member

I think I agree, the main point that PyO3 still requires attachment to a Python thread to do Python work is a bit blurred with the technical details of how attachment works here.

Comment on lines 45 to 47
thread state after the closure passed to `with_gil` finishes. Similarly, in both
the GIL-enabled and free-threaded build, you must use `allow_threads` in
order to use rust threads. Both of `with_gil` and `allow_threads` tell CPython
Contributor

I'm not quite sure how to interpret the "Rust threads" part. I don't think there is a problem in just spawning a Rust thread and doing something unrelated to Python in the background. It's more about detaching the current thread from its interaction with the interpreter and handing back control, if I understood that correctly.

Member

@davidhewitt davidhewitt left a comment

Thanks, this is already looking great to me! I have a few suggestions too...

Comment on lines 54 to 59
If you wrote code that makes strong assumptions about the GIL protecting shared
mutable state, it may not currently be straightforward to support free-threaded
Python without the risk of runtime mutable borrow panics. PyO3 does not lock
access to python state, so if more than one thread tries to access a python
object that has already been mutably borrowed, only runtime checking enforces
safety around mutably aliased data owned by the Python interpreter.
Member

Here there is a point to be made that the user would have knowingly made such an assumption by using unsafe impl for Send and/or Sync (especially if the PyO3 API is correct in its requirements of these).

object that has already been mutably borrowed, only runtime checking enforces
safety around mutably aliased data owned by the Python interpreter.

It was always possible to generate panics like this in PyO3 in code that
Member

It might be helpful to link to ./class.md#bound-and-interior-mutability somewhere here.
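
A possible illustrative example for the runtime borrow checking discussed here, sketched with the standard library's `RefCell` as an analogy only. This is not PyO3's own mechanism, `RefCell` is single-threaded, and the `Counter` type and function names are hypothetical. It shows a second mutable borrow being rejected at runtime while the first is still live:

```rust
use std::cell::RefCell;

/// Hypothetical stand-in for a value whose borrows are checked at runtime.
pub struct Counter {
    pub value: u64,
}

/// Returns `true` if a second mutable borrow is rejected while the first
/// one is still alive.
pub fn second_mutable_borrow_is_rejected() -> bool {
    let cell = RefCell::new(Counter { value: 0 });
    let mut first = cell.borrow_mut();
    first.value += 1;
    // `try_borrow_mut` reports the conflict as an `Err`;
    // calling `borrow_mut` here instead would panic.
    let rejected = cell.try_borrow_mut().is_err();
    drop(first);
    rejected
}

/// Once the first borrow is released, a new mutable borrow succeeds.
pub fn borrow_after_release() -> u64 {
    let cell = RefCell::new(Counter { value: 0 });
    {
        let mut guard = cell.borrow_mut();
        guard.value += 1;
    } // first borrow ends here
    let mut guard = cell.try_borrow_mut().expect("no outstanding borrow");
    guard.value += 1;
    guard.value
}
```

The fallible `try_` form is the part worth pointing readers at: it turns a would-be panic into an error the caller can handle.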

Comment on lines 65 to 66
We will allow user-selectable semantics for mutable pyclass definitions in
PyO3 0.24, allowing some form of opt-in locking to emulate the GIL if
Member

Maybe lessen "we will" to "we plan to" ? 🙈
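
To make "opt-in locking" concrete, one option is a sketch where the type guards its own mutable state with a lock, so that methods taking `&self` are safe to call from many threads. This uses only the standard library. `LockedCounter` and `hammer` are hypothetical names, and this is not the planned PyO3 design:

```rust
use std::sync::{Arc, Mutex};
use std::thread;

/// Hypothetical type that protects its mutable state with its own lock
/// instead of relying on a global lock.
pub struct LockedCounter {
    inner: Mutex<u64>,
}

impl LockedCounter {
    pub fn new() -> Self {
        Self { inner: Mutex::new(0) }
    }

    /// Takes `&self`: the mutation happens behind the lock.
    pub fn increment(&self) -> u64 {
        let mut guard = self.inner.lock().expect("lock poisoned");
        *guard += 1;
        *guard
    }

    pub fn get(&self) -> u64 {
        *self.inner.lock().expect("lock poisoned")
    }
}

/// Increments one shared counter from several threads and returns the total.
pub fn hammer(n_threads: usize, per_thread: u64) -> u64 {
    let counter = Arc::new(LockedCounter::new());
    let handles: Vec<_> = (0..n_threads)
        .map(|_| {
            let counter = Arc::clone(&counter);
            thread::spawn(move || {
                for _ in 0..per_thread {
                    counter.increment();
                }
            })
        })
        .collect();
    for handle in handles {
        handle.join().expect("worker thread panicked");
    }
    counter.get()
}
```

Compared with runtime borrow checking, a contended call here blocks until the lock is free rather than failing.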

@ngoldbaum
Contributor Author

Thanks for all the comments! I hope the new text addresses the concerns and is clearer.

I'd like to add sections on the critical section and PyMutex wrappers, assuming those get merged, and also maybe another section in the docs about multithreaded programming using PyO3? I definitely found it nontrivial to learn and things like std::thread::scope should be pointed at explicitly in the PyO3 docs.
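
For the `std::thread::scope` point, a small standard-library-only sketch might look like the following. The function name `parallel_sum` is hypothetical and no PyO3 APIs are involved. Scoped threads can borrow data from the caller's stack because the scope joins every thread before it returns:

```rust
use std::thread;

/// Sums `data` by splitting it into roughly `n_chunks` pieces and summing
/// each piece on its own scoped thread.
pub fn parallel_sum(data: &[u64], n_chunks: usize) -> u64 {
    if data.is_empty() {
        return 0;
    }
    let n_chunks = n_chunks.max(1);
    // Ceiling division, so that at most `n_chunks` chunks are produced.
    let chunk_len = (data.len() + n_chunks - 1) / n_chunks;

    thread::scope(|scope| {
        // Each worker borrows a sub-slice of `data`; no `Arc` or cloning is
        // needed because the scope outlives every spawned thread.
        let handles: Vec<_> = data
            .chunks(chunk_len)
            .map(|chunk| scope.spawn(move || chunk.iter().sum::<u64>()))
            .collect();

        handles
            .into_iter()
            .map(|handle| handle.join().expect("worker thread panicked"))
            .sum::<u64>()
    })
}
```

With `std::thread::spawn` the same code would need `'static` data, which is why the scoped form is worth pointing at explicitly.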

Contributor

@Icxolu Icxolu left a comment

Very nice, this reads a lot easier to me.

I guess my main concern is still what terminology we use when we want to talk about interacting with the interpreter.

I think there are now 3 different wordings in different paragraphs. Sometimes it's still referred to as "GIL", then we have "thread state", and we have "attached to the runtime". I think unifying them makes sense so that we don't overload readers. Personally I like the "attached to the runtime" variant because it's high level enough that someone without much experience can get an idea about what's going on (I think?) and it's independent of the build mode. We can also add a later (sub)section explaining what "attached to the runtime" means as technical documentation.

Comment on lines +69 to +73
In the GIL-enabled build, releasing the GIL allows other threads to
proceed. This is no longer necessary in the free-threaded build, but you should
still release the GIL when doing long-running tasks that do not require the
CPython runtime, since releasing the GIL unblocks running the Python garbage
collector and freeing unused memory.
Contributor

This one is a bit weird because you are talking about releasing the GIL in a free-threaded context. Maybe we should introduce some wording independent of the build mode which we can use in such situations. Maybe something like "detaching from the runtime/interpreter" or similar.

Comment on lines +63 to +67
attached thread state. The CPython runtime also assumes it is responsible for
creating and destroying threads, so it is necessary to detach from the runtime
before creating any native threads outside of the CPython runtime. In the
GIL-enabled build, this corresponds to dropping the GIL with an `allow_threads`
call.
Contributor

This requirement still seems new to me and I could not find any mention of it in the current guide. Do you have a source where I could read more about that?

Here https://pyo3.rs/v0.22.2/parallelism.html we also use rayon without any allow_threads needed. I believe rayon also creates its global thread pool lazily (but I haven't checked).

But in general we cannot ensure that an API from some crate a user might call does not internally spawn a thread. So this would severely limit what is "ok" to do...

Comment on lines +51 to +59
In order to use the CPython C API in both the GIL-enabled and free-threaded
builds of CPython, the thread calling into the C API must own an attached Python
thread state. In the GIL-enabled build the thread that holds the GIL by
definition is attached to a valid Python thread state, and therefore only one
thread at a time can call into the C API.

When a thread releases the GIL, the Python thread state owned by that thread is
detached from the interpreter runtime, and it is not valid to call into the
CPython C API.
Contributor

This already reads much easier to me! 👍

However, I think "thread state" is still a term that a new user will not be familiar with. In my understanding, the main point we want to make here is that any thread calling into Python needs to be "attached to the interpreter". In the GIL-enabled build there can only ever be one such thread, and in the free-threaded build there can be multiple of them. Would it be too much of a simplification to just say, for example, "any thread calling into the C API must be attached to the Python interpreter"? This is still abstract, but maybe it can give an idea of the high-level interaction without throwing a user directly into the middle of CPython.

Comment on lines +19 to +23
PyO3's support for free-threaded Python will enable authoring native Python
extensions that are thread-safe by construction, with much stronger safety
guarantees than C extensions. Our goal is to enable ["fearless
concurrency"](https://doc.rust-lang.org/book/ch16-00-concurrency.html) in the
native Python runtime by building on the rust `Send` and `Sync` traits.
Contributor

Very nice, I like this a lot! ❤️

4 participants