-
-
Notifications
You must be signed in to change notification settings - Fork 31k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
gh-115999: Enable BINARY_SUBSCR_GETITEM for free-threaded build #127737
gh-115999: Enable BINARY_SUBSCR_GETITEM for free-threaded build #127737
Conversation
To enable BINARY_SUBSCR_GETITEM, we should handle TypeObject and _spec_cache which are mutable by other threads. But I am not sure this approach is proper approach since I noticed that cached items can be mutated before the interpreter gets cached objects, so I have added the logic to check whether the item is NULL or FunctionObject, but please let me know if my approach is wrong. Anyway, this is the first working version without error. PTAL. |
This reverts commit a059e68.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for working on this! I left a few comments inline.
Python/bytecodes.c
Outdated
@@ -881,8 +880,8 @@ dummy_func( | |||
|
|||
op(_BINARY_SUBSCR_INIT_CALL, (container, sub -- new_frame: _PyInterpreterFrame* )) { | |||
PyTypeObject *tp = Py_TYPE(PyStackRef_AsPyObjectBorrow(container)); | |||
PyHeapTypeObject *ht = (PyHeapTypeObject *)tp; | |||
PyObject *getitem = ht->_spec_cache.getitem; | |||
PyObject *getitem = _PyType_GetItemFromCache(tp); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's probably safe to reload getitem from the cache after the guard passes. (If the getitem in the cache is replaced after the guard passes, the new getitem cannot be mutated in a way that would change the function version: that would require stopping the world). However, I'd prefer that we pass the getitem that is loaded in _BINARY_SUBSCR_CHECK_FUNC
to _BINARY_SUBSCR_INIT_CALL
.
@mpage Hmm would you like to take a look again? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall approach is looking good! I'd still prefer it if we passed the getitem
that is loaded in _BINARY_SUBSCR_CHECK_FUNC
to _BINARY_SUBSCR_INIT_CALL
. That would also allow us to avoid introducing the DEOPT_IF
in _BINARY_SUBSCR_INIT_CALL
. Is there a reason you don't want to do that?
Oops - fixed! |
Done :) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good! I noticed one last thing then I think we're good to go.
We need to update the opcode definition of _BINARY_SUBSCR_INIT_CALL
in Python/optimizer_bytecodes.c
to handle the extra output (getitem
) of _BINARY_SUBSCR_CHECK_FUNC
.
Done! Even if I am not sure that this change is the correct for tier 2 because I have never handled this before :) |
The tier2 change is correct. I should add an automatic script that checks tier2 and tier1 definitions are synced so that you don't trip over this. Thanks Matt for catching that! |
@mpage Okay to merge this PR? |
I kicked off a couple of benchmark runs earlier today. They should be done shortly. Let’s wait for the results. Otherwise, looks great! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
|
|
|
|
|
|
I don't think a 0.3-0.5% slower on average with a 13% slowdown on the Could we revert this PR until we have a better understanding of what is going on here? |
The How about you run the benchmarks in the faster-cpython infrastructure and see if you get the same results? |
--disable-gil
builds #115999