Skip to content

Fix free-threaded cleanup race in _replace_dunder_methods#21618

Open
Furkan-rgb wants to merge 1 commit intoLightning-AI:masterfrom
Furkan-rgb:fix/free-threaded-dunder-cleanup
Open

Fix free-threaded cleanup race in _replace_dunder_methods#21618
Furkan-rgb wants to merge 1 commit intoLightning-AI:masterfrom
Furkan-rgb:fix/free-threaded-dunder-cleanup

Conversation

@Furkan-rgb
Copy link
Copy Markdown

@Furkan-rgb Furkan-rgb commented Mar 27, 2026

What does this PR do?

Fixes a race in lightning.fabric.utilities.data._replace_dunder_methods when the cleanup path runs concurrently under free-threaded Python.

In the finally block, two overlapping callers can observe __old__* in cls.__dict__, but one thread may restore and delete the attribute before the other finishes its own getattr/setattr/delattr sequence. Under Python 3.14t with PYTHON_GIL=0, that can raise AttributeError and interrupt the training loop.

This change makes the cleanup tolerant to that race by skipping AttributeError if another thread has already restored the original dunder method. It also adds a regression test that simulates a concurrent restore during __old__delattr__ cleanup.

Why is this needed?

This affects free-threaded Python runs where Lightning patches overlapping class hierarchies from multiple threads, for example threaded hyperparameter searches.

Does this PR introduce any breaking changes?

No.

How was this tested?

  • python -m pytest tests/tests_fabric/utilities/test_data.py -k replace_dunder_methods -v
  • python -m pytest tests/tests_pytorch/utilities/test_data.py::test_custom_torch_batch_sampler -v

📚 Documentation preview 📚: https://pytorch-lightning--21618.org.readthedocs.build/en/21618/

@github-actions github-actions bot added the fabric lightning.fabric.Fabric label Mar 27, 2026
@codecov
Copy link
Copy Markdown

codecov bot commented Mar 30, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 87%. Comparing base (6805188) to head (f88aca9).
⚠️ Report is 14 commits behind head on master.

Additional details and impacted files
@@           Coverage Diff           @@
##           master   #21618   +/-   ##
=======================================
- Coverage      87%      87%   -0%     
=======================================
  Files         270      270           
  Lines       23895    23899    +4     
=======================================
+ Hits        20675    20676    +1     
- Misses       3220     3223    +3     

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

fabric lightning.fabric.Fabric

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants