PEP 797: Shared Object Proxies #4536
PEP: 797
Title: Shared Object Proxies
Author: Peter Bierma <peter@python.org>
Discussions-To: Pending
Status: Draft
Type: Standards Track
Created: 08-Aug-2025
Python-Version: 3.15
Post-History: `01-Jul-2025 <https://discuss.python.org/t/97306>`__

Abstract
========

This PEP introduces a new :func:`~concurrent.interpreters.share` function to
the :mod:`concurrent.interpreters` module, which allows an arbitrary object
to be shared across interpreters through an object proxy, at the cost of
reduced efficiency in multithreaded code.

For example:

.. code-block:: python

    from concurrent import interpreters

    with open("spanish_inquisition.txt", "w") as unshareable:
        interp = interpreters.create()
        proxy = interpreters.share(unshareable)
        interp.prepare_main(file=proxy)
        interp.exec("file.write(\"I didn't expect the Spanish Inquisition\")")

Motivation
==========

Many Objects Cannot be Shared Between Subinterpreters
------------------------------------------------------

In Python 3.14, the new :mod:`concurrent.interpreters` module can be used to
create multiple interpreters in a single Python process. This works well for
code without shared state, but since one of the primary applications of
subinterpreters is to bypass the :term:`global interpreter lock`, it is
fairly common for programs to require highly complex data structures that are
not easily shareable. In turn, this limits the practicality of
subinterpreters for concurrency.

As of writing, subinterpreters can only share :ref:`a handful of types
<interp-object-sharing>` natively, relying on the :mod:`pickle` module
for other types. This is quite limiting, as many kinds of objects cannot be
serialized with ``pickle`` (such as file objects returned by :func:`open`).
Additionally, serialization can be a very expensive operation, which is not
ideal for multithreaded applications.
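
To briefly illustrate that limitation (a minimal sketch, not part of the
proposed API), attempting to pickle a file object fails outright:

.. code-block:: python

    import pickle

    with open("spanish_inquisition.txt", "w") as file:
        try:
            pickle.dumps(file)
        except TypeError as exc:
            # Raises something like:
            # TypeError: cannot pickle '_io.TextIOWrapper' object
            print(exc)
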
Rationale
=========

A Fallback for Object Sharing
-----------------------------

A shared object proxy is designed to be a fallback for sharing an object
between interpreters. A shared object proxy should only be used as
a last resort for highly complex objects that cannot be serialized or shared
in any other way.

This means that even if this PEP is accepted, there is still benefit in
implementing other methods to share objects between interpreters.

Specification
=============

.. class:: concurrent.interpreters.SharedObjectProxy

   A proxy type that allows access to an object across multiple interpreters.
   This cannot be constructed from Python; instead, use the
   :func:`~concurrent.interpreters.share` function.


.. function:: concurrent.interpreters.share(obj)

   Wrap *obj* in a :class:`~concurrent.interpreters.SharedObjectProxy`,
   allowing it to be used in other interpreter APIs as if it were natively
   shareable.

   If *obj* is natively shareable, this function does not create a proxy and
   simply returns *obj*.

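For example, a minimal sketch of the intended behavior (assuming the
semantics above; ``share`` and ``SharedObjectProxy`` do not exist yet):

.. code-block:: python

    from concurrent import interpreters

    # Strings are already natively shareable, so no proxy is created.
    text = "no proxy needed"
    assert interpreters.share(text) is text

    # File objects are not natively shareable, so a proxy wraps them.
    with open("spanish_inquisition.txt", "w") as file:
        proxy = interpreters.share(file)
        assert type(proxy) is interpreters.SharedObjectProxy
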
Interpreter Switching
---------------------

When interacting with the wrapped object, the proxy switches to the
interpreter in which the object was created. This must happen for any access
to the object, such as accessing attributes or modifying the object's
:term:`reference count`. To visualize, ``foo`` in the following code is only
ever called in the main interpreter, despite being accessed in subinterpreters
through a proxy:

.. code-block:: python

    from concurrent import interpreters

    def foo():
        assert interpreters.get_current() == interpreters.get_main()

    interp = interpreters.create()
    proxy = interpreters.share(foo)
    interp.prepare_main(foo=proxy)
    interp.exec("foo()")

Multithreaded Scaling
---------------------

To switch to a wrapped object's interpreter, an object proxy must swap the
:term:`attached thread state` of the current thread, which will in turn wait
on the :term:`GIL` of the target interpreter, if it is enabled. This means that
a shared object proxy will experience contention when accessed concurrently,
but it is still useful for multicore threading, since other threads in the
interpreter are free to execute while waiting on the GIL of the target
interpreter.

As an example, imagine that multiple interpreters want to write a log through
a proxy for the main interpreter, but don't want to constantly wait on the log.
By accessing the proxy in a separate thread for each interpreter, the thread
performing the computation can still execute while accessing the proxy.

.. code-block:: python

    from concurrent import interpreters

    def write_log(message):
        print(message)

    def execute(n, write_log):
        from threading import Thread
        from queue import Queue

        log = Queue()

        # By performing this in a separate thread, 'execute' can still run
        # while the log is being accessed by the main interpreter.
        def log_queue_loop():
            while True:
                message = log.get()
                if message is None:
                    # Sentinel pushed once the computation is finished.
                    break
                write_log(message)

        thread = Thread(target=log_queue_loop)
        thread.start()

        for i in range(100000):
            n ** i
            log.put(f"Completed an iteration: {i}")

        # Signal the logging thread to stop, then wait for it to drain the queue.
        log.put(None)
        thread.join()

    proxy = interpreters.share(write_log)
    for n in range(4):
        interp = interpreters.create()
        interp.call_in_thread(execute, n, proxy)

Proxy Copying
-------------

Contrary to what one might think, a shared object proxy itself can only be used
in one interpreter, because the proxy's reference count is not thread-safe
(and thus cannot be accessed from multiple interpreters). Instead, when crossing
an interpreter boundary, a new proxy is created for the target interpreter that
wraps the same object as the original proxy.

For example, in the following code, two proxies are created, not just one.

.. code-block:: python

    from concurrent import interpreters

    interp = interpreters.create()
    foo = object()
    proxy = interpreters.share(foo)

    # The proxy crosses an interpreter boundary here. 'proxy' is *not* directly
    # sent to 'interp'. Instead, a new proxy is created for 'interp', and the
    # reference to 'foo' is merely copied. Thus, both interpreters have their
    # own proxy wrapping the same object.
    interp.prepare_main(proxy=proxy)

Thread-local State
------------------

Accessing an object proxy will retain information stored on the current
:term:`thread state`, such as thread-local variables stored by
:class:`threading.local` and context variables stored by :mod:`contextvars`.
This allows the following case to work correctly:

.. code-block:: python

    from concurrent import interpreters
    from threading import local

    thread_local = local()
    thread_local.value = 1

    def foo():
        assert thread_local.value == 1

    interp = interpreters.create()
    proxy = interpreters.share(foo)
    interp.prepare_main(foo=proxy)
    interp.exec("foo()")

In order to retain thread-local data when accessing an object proxy, each
thread will have to keep track of the last-used thread state for
each interpreter. In C, this behavior looks like this:

.. code-block:: c

    // Error checking has been omitted for brevity
    PyThreadState *tstate = PyThreadState_New(interp);

    // By swapping the current thread state to 'interp', 'tstate' will be
    // associated with 'interp' for the current thread. That means that
    // accessing a shared object proxy will use 'tstate' instead of creating
    // its own thread state.
    PyThreadState *save = PyThreadState_Swap(tstate);

    // 'save' is now the most recently used thread state, so shared object
    // proxies in this thread will use it instead of 'tstate' when accessing
    // 'interp'.
    PyThreadState_Swap(save);

In the event that no thread state exists for an interpreter in a given thread,
a shared object proxy will create its own thread state that is owned by
the interpreter (meaning it will not be destroyed until interpreter
finalization) and persists across all shared object proxy accesses in the
thread. In other words, a shared object proxy ensures that thread-local
variables and similar state do not disappear between accesses.

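To illustrate that persistence (a hypothetical sketch assuming the semantics
above; ``bump`` is an illustrative helper, not part of the proposal),
thread-local state written during one proxy access remains visible on later
accesses from the same thread:

.. code-block:: python

    from concurrent import interpreters
    from threading import local

    counter = local()

    def bump():
        # Runs in the main interpreter via the proxy; the same thread state
        # is reused for each access, so the counter keeps its value.
        counter.count = getattr(counter, "count", 0) + 1
        return counter.count

    interp = interpreters.create()
    interp.prepare_main(bump=interpreters.share(bump))
    interp.exec("assert bump() == 1")
    interp.exec("assert bump() == 2")  # state from the first call persists
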
Memory Management
-----------------

All proxy objects hold a :term:`strong reference` to the object that they
wrap. As such, destruction of a shared object proxy may trigger destruction
of the wrapped object if the proxy holds the last reference to it, even if
the proxy belongs to a different interpreter. For example:

.. code-block:: python

    from concurrent import interpreters

    interp = interpreters.create()
    foo = object()
    proxy = interpreters.share(foo)
    interp.prepare_main(proxy=proxy)
    del proxy, foo

    # The object formerly bound to 'foo' is still alive at this point, because
    # the proxy in 'interp' still holds a reference to it. Destruction of
    # 'interp' will then trigger the destruction of its proxy, and subsequently
    # the destruction of the wrapped object.
    interp.close()

Shared object proxies support the garbage collector protocol, but will only
traverse the object that they wrap if the garbage collection is occurring
in the wrapped object's interpreter. To visualize:

.. code-block:: python

    from concurrent import interpreters
    import gc

    proxy = interpreters.share(object())

    # This prints [<object object at 0x...>], because the object is owned
    # by this interpreter.
    print(gc.get_referents(proxy))

    interp = interpreters.create()
    interp.prepare_main(proxy=proxy)

    # This prints [], because the wrapped object must be invisible to this
    # interpreter.
    interp.exec("import gc; print(gc.get_referents(proxy))")

Interpreter Lifetimes
*********************

When an interpreter is destroyed, proxies wrapping objects from that
interpreter may still exist elsewhere. To prevent this from causing crashes,
an interpreter will invalidate all proxies pointing to its objects by
overwriting their wrapped object with ``None``.

To demonstrate, the following snippet first prints ``Alive``, and then
``None`` after deleting the interpreter:

.. code-block:: python

    from concurrent import interpreters

    def test():
        from concurrent import interpreters

        class Test:
            def __str__(self):
                return "Alive"

        return interpreters.share(Test())

    interp = interpreters.create()
    wrapped = interp.call(test)
    print(wrapped)  # Alive
    interp.close()
    print(wrapped)  # None

Note that the proxy is not physically replaced (``wrapped`` in the above example
is still a ``SharedObjectProxy`` instance); instead, its wrapped object is
replaced with ``None``.

Backwards Compatibility
=======================

This PEP has no known backwards compatibility issues.


Security Implications
=====================

This PEP has no known security implications.

How to Teach This
=================

New APIs and important information about how to use them will be added to the
:mod:`concurrent.interpreters` documentation.


Reference Implementation
========================

The reference implementation of this PEP can be found
`here <https://github.com/python/cpython/compare/main...ZeroIntensity:cpython:shared-object-proxy>`_.

Rejected Ideas
==============

Directly Sharing Proxy Objects
------------------------------

The initial revision of this proposal took an approach where an instance of
:class:`~concurrent.interpreters.SharedObjectProxy` was :term:`immortal`. This
allowed proxy objects to be directly shared across interpreters, because their
reference count was thread-safe (since it never changed due to immortality).

This proved to make the implementation significantly more complicated, and it
also introduced many edge cases that would have been a burden on CPython
maintainers.

Acknowledgements
================

This PEP would not have been possible without discussion and feedback from
Eric Snow, Petr Viktorin, Kirill Podoprigora, Adam Turner, and Yury Selivanov.


Copyright
=========

This document is placed in the public domain or under the
CC0-1.0-Universal license, whichever is more permissive.