-
Notifications
You must be signed in to change notification settings - Fork 872
Fix thread-safety issue in LogRecordSharedPool.Rent() #6833
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Fix thread-safety issue in LogRecordSharedPool.Rent() #6833
Conversation
Fixes open-telemetry#6233 - Change TryRentCoreRare failure handling from 'continue' to 'break' to prevent duplicate LogRecord rentals under high contention - Optimize slotIndex calculation - Add regression test RentShouldNeverReturnSameInstanceConcurrently
|
Hi. @alanwest @open-telemetry/dotnet-approvers could you please review this? It addresses the thread-safety issue in LogRecordSharedPool (Fixes #6233). @TimothyMothra, tagging you as well since this is in the Logs/LogRecord pool area. Also, this is my first time requesting a review in this repo—if I’m not tagging the right people or there’s a preferred process, please let me know and I’ll follow it. |
|
This PR was marked stale due to lack of activity and will be closed in 7 days. Commenting or pushing will instruct the bot to automatically remove the label. This bot runs once per day. |
martincostello
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please update the CHANGELOG with the bug fix details.
| if (!inUseRecords.TryAdd(record, Environment.CurrentManagedThreadId)) | ||
| { | ||
| Volatile.Write(ref duplicateDetected, true); | ||
| duplicateDetails = $"LogRecord {record.GetHashCode()} rented by thread {inUseRecords[record]} and {Environment.CurrentManagedThreadId}"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the bug happened again, this might overwrite a race itself. It would be better to add the message to a concurrent collection, then join its contents to construct the failure message.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for the review, @martincostello!
I've addressed your feedback:
-
CHANGELOG updated - Added the bug fix entry under the
Unreleasedsection in src/OpenTelemetry/CHANGELOG.md -
Test race condition fixed - Changed the duplicate detection logic to use
ConcurrentQueue<string>instead of a single string variable. Now all duplicate events are collected safely and joined in the final assertion message.
Changes:
- Replaced
duplicateDetected(bool) +duplicateDetails(string) withConcurrentQueue<string> duplicateMessages - Used
TryGetValueto safely retrieve the first thread ID - Updated the assertion to
Assert.True(duplicateMessages.IsEmpty, ...)
Please let me know if there's anything else that needs to be addressed.
- Add CHANGELOG entry for thread-safety fix in LogRecordSharedPool.Rent() - Use ConcurrentQueue for duplicate detection messages in test to avoid race condition when collecting failure details"
Co-authored-by: Martin Costello <martin@martincostello.com>
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## main #6833 +/- ##
==========================================
- Coverage 87.17% 87.14% -0.04%
==========================================
Files 263 263
Lines 12385 12386 +1
==========================================
- Hits 10797 10794 -3
- Misses 1588 1592 +4
Flags with carried forward coverage won't be shown. Click here to find out more.
|
|
This PR was marked stale due to lack of activity and will be closed in 7 days. Commenting or pushing will instruct the bot to automatically remove the label. This bot runs once per day. |
|
Hi @martincostello — re-requesting review 🙇 |
Fixes #6233
Changes
This PR fixes a thread-safety issue in
LogRecordSharedPool.Rent()where the same LogRecord instance could be rented to multiple threads concurrently under high contention.continueand retry with a new index, which could lead to a duplicate rental.slotIndexcalculation to avoid redundant modulo operations.Merge requirement checklist
CHANGELOG.mdfiles updated for non-trivial changes