Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Race in PyUnicode_InternFromString under free-threading #128137

Closed
hawkinsp opened this issue Dec 20, 2024 · 2 comments
Closed

Race in PyUnicode_InternFromString under free-threading #128137

hawkinsp opened this issue Dec 20, 2024 · 2 comments
Labels
interpreter-core (Objects, Python, Grammar, and Parser dirs) topic-C-API topic-free-threading type-bug An unexpected behavior, bug, or error

Comments

@hawkinsp
Copy link
Contributor

hawkinsp commented Dec 20, 2024

Bug report

Bug description:

Here's a race reported by thread sanitizer that I haven't been able to find a small reproducer for, but it does look racy reading the code.

WARNING: ThreadSanitizer: data race (pid=1575489)
  Read of size 4 at 0x7fb14614ee00 by thread T130:
    #0 unicode_eq /usr/local/google/home/phawkins/p/cpython/Objects/stringlib/eq.h:13:9 (python3.13+0x25da7a) (BuildId: 9c1c16fb1bb8a435fa6fa4c6944da5d41f654e96)
    #1 compare_unicode_unicode /usr/local/google/home/phawkins/p/cpython/Objects/dictobject.c:1139:50 (python3.13+0x25da7a)
    #2 do_lookup /usr/local/google/home/phawkins/p/cpython/Objects/dictobject.c:1063:23 (python3.13+0x25da7a)
    #3 unicodekeys_lookup_unicode /usr/local/google/home/phawkins/p/cpython/Objects/dictobject.c:1148:12 (python3.13+0x25da7a)
    #4 _Py_dict_lookup /usr/local/google/home/phawkins/p/cpython/Objects/dictobject.c:1259:22 (python3.13+0x25ea02) (BuildId: 9c1c16fb1bb8a435fa6fa4c6944da5d41f654e96)
    #5 dict_setdefault_ref_lock_held /usr/local/google/home/phawkins/p/cpython/Objects/dictobject.c:4282:21 (python3.13+0x26a5da) (BuildId: 9c1c16fb1bb8a435fa6fa4c6944da5d41f654e96)
    #6 PyDict_SetDefaultRef /usr/local/google/home/phawkins/p/cpython/Objects/dictobject.c:4332:11 (python3.13+0x26a221) (BuildId: 9c1c16fb1bb8a435fa6fa4c6944da5d41f654e96)
    #7 intern_common /usr/local/google/home/phawkins/p/cpython/Objects/unicodeobject.c:15225:19 (python3.13+0x34a02c) (BuildId: 9c1c16fb1bb8a435fa6fa4c6944da5d41f654e96)
    #8 _PyUnicode_InternMortal /usr/local/google/home/phawkins/p/cpython/Objects/unicodeobject.c:15286:10 (python3.13+0x34a40c) (BuildId: 9c1c16fb1bb8a435fa6fa4c6944da5d41f654e96)
    #9 PyUnicode_InternFromString /usr/local/google/home/phawkins/p/cpython/Objects/unicodeobject.c:15322:5 (python3.13+0x34a40c)
    #10 nanobind::detail::nb_type_new(nanobind::detail::type_init_data const*) nb_type.cpp (xla_extension.so+0xdcd28aa) (BuildId: e484e79ecc5a6e10)
...

  Previous write of size 4 at 0x7fb14614ee00 by thread T137:
    #0 immortalize_interned /usr/local/google/home/phawkins/p/cpython/Objects/unicodeobject.c:15141:34 (python3.13+0x34a20e) (BuildId: 9c1c16fb1bb8a435fa6fa4c6944da5d41f654e96)
    #1 intern_common /usr/local/google/home/phawkins/p/cpython/Objects/unicodeobject.c:15270:9 (python3.13+0x34a20e)
    #2 _PyUnicode_InternMortal /usr/local/google/home/phawkins/p/cpython/Objects/unicodeobject.c:15286:10 (python3.13+0x34a40c) (BuildId: 9c1c16fb1bb8a435fa6fa4c6944da5d41f654e96)
    #3 PyUnicode_InternFromString /usr/local/google/home/phawkins/p/cpython/Objects/unicodeobject.c:15322:5 (python3.13+0x34a40c)
    #4 nanobind::detail::nb_type_new(nanobind::detail::type_init_data const*) nb_type.cpp (xla_extension.so+0xdcd28aa) (BuildId: e484e79ecc5a6e10)
...

I think the scenario here is:

  • thread A and B are simultaneously interning strings
  • thread A succeeds at inserting that string into the intern dictionary, and is at the end of intern_common immortalizing the string, which sets the .interned field on the string
  • thread B is now trying to intern a string and is performing an equality test during the intern dictionary lookup, which reads the .kind field.

The .kind and .interned fields are bitfields in the same word, so this is a race, and I can't see any synchronization or atomicity that would prevent it.

Perhaps we need to hold the critical section on the intern dictionary longer, until immortalization is complete?

CPython versions tested on:

3.13

Operating systems tested on:

Linux

Linked PRs

@hawkinsp hawkinsp added the type-bug An unexpected behavior, bug, or error label Dec 20, 2024
@picnixz picnixz added interpreter-core (Objects, Python, Grammar, and Parser dirs) topic-C-API labels Dec 20, 2024
@colesbury
Copy link
Contributor

colesbury commented Dec 21, 2024

We want to be able to read interned without holding a lock. Atomics don't mix well with bitfields, so I think the easiest thing would be to move interned out of the bitfield into its own byte and shrink the padding bits. The total size shouldn't need to change.

In summary:

  • The interned field, which can change if/when the string object is interned, gets its own byte so that it can be read atomically without a lock
  • The remaining bitfields (kind, compact, ascii, statically_allocated) are effectively immutable

@corona10
Copy link
Member

corona10 commented Jan 5, 2025

@hawkinsp ae23a01 is now merged, so it should be okay at 3.14, but we are not going to backport this patch into 3.13 since it will raise compatibility issue who use Python 3.13 with GIL and free-threaded build.

See: #128196 (comment)

@corona10 corona10 closed this as completed Jan 5, 2025
srinivasreddy pushed a commit to srinivasreddy/cpython that referenced this issue Jan 6, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
interpreter-core (Objects, Python, Grammar, and Parser dirs) topic-C-API topic-free-threading type-bug An unexpected behavior, bug, or error
Projects
None yet
Development

No branches or pull requests

5 participants