issue-2409: fix null allocator access in logging during shutdown when using CUSTOM_MEMORY_MANAGEMENT #2462
Issue #, if available:
This fixes #2409.
Description of changes:
In #1996 the lifetime of the loggers was changed so that they outlive the CRT. With CUSTOM_MEMORY_MANAGEMENT, cleaning up the CRT also cleans up the allocator. If any threads are still using resources at that point, such as the loggers that outlive the CRT, we can hit an assertion in aws-c-common indicating that the allocator is null under certain conditions.
Restoring the correct ShutdownAPI order (the exact opposite of InitAPI's sequence) resolves #2409 but re-introduces the root cause that led to #1996 in the first place: #1995. Due to a lack of thread safety in the aws-c-common logging methods, there is no way for us to reliably uninstall the CRT logger and then clean it up, because it might be in use by one of the CRT subsystems like event_loop. For example, see the following output from drd:
Without underlying support for thread safety, one way to fix this would require an interface change. CRTLogSystemInterface could gain a ShutdownLogging() method, which unhooks the logger from aws-c-common but does not clean it up. During ShutdownAPI, the loggers would be shut down, then the CRT cleaned up. At that point there are no consumers of the logging subsystem left, so it would be safe to destroy the global common logger by destructing DefaultCRTLogSystem after CleanupCRT.
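To make the proposed ordering concrete, here is a minimal, self-contained sketch. It does not use the real SDK types: `LogSystemSketch`, `DefaultLogSystemSketch`, `CleanupCrtSketch`, and `RunShutdownSketch` are hypothetical stand-ins, and only the `ShutdownLogging()` name comes from the description above. It records the order of events so the sequence (unhook, clean up the CRT, then destroy the logger) can be checked.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <vector>

// Hypothetical event log used only to observe the shutdown order.
static std::vector<std::string> g_events;

// Hypothetical stand-in for CRTLogSystemInterface with the proposed method.
class LogSystemSketch {
public:
    virtual ~LogSystemSketch() = default;
    // Proposed addition: unhook from the global logger, but free nothing.
    virtual void ShutdownLogging() = 0;
};

// Hypothetical stand-in for DefaultCRTLogSystem.
class DefaultLogSystemSketch : public LogSystemSketch {
public:
    DefaultLogSystemSketch() { g_events.push_back("logger installed"); }
    ~DefaultLogSystemSketch() override { g_events.push_back("logger destroyed"); }
    void ShutdownLogging() override { g_events.push_back("logger unhooked"); }
};

// Hypothetical stand-in for CleanupCRT.
void CleanupCrtSketch() { g_events.push_back("CRT cleaned up"); }

// Runs the proposed sequence and returns the recorded events.
std::vector<std::string> RunShutdownSketch() {
    g_events.clear();
    auto logger = std::make_unique<DefaultLogSystemSketch>();
    logger->ShutdownLogging();  // 1. unhook: no new log calls reach the logger
    CleanupCrtSketch();         // 2. CRT threads (the only consumers) go away
    logger.reset();             // 3. only now is the logger destroyed
    return g_events;
}
```

The point of splitting "unhook" from "destroy" is that the destructor runs only after every potential consumer has been torn down, which is the property the interface change would provide.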
Avoiding an interface change, I took another approach. After unhooking DefaultCRTLogSystem from the aws-c-common global logging subsystem, there could be threads that have acquired the logger pointer but have not used it yet. It's ugly, and it's not guaranteed, but a 25ms delay after setting the logger to NULL allows other threads to do their logging business and go on their merry way, making it safe(r) for DefaultCRTLogSystem to clean up m_logger. I'll fully admit that it's a hack, but it's a hack that allowed my team to make forward progress - at least until someone wants to muck with the CRTLogSystemInterface.
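The shape of that workaround can be sketched as follows. This is an illustration under stated assumptions, not the code in this PR: `LoggerSketch`, `g_logger`, `LogIfInstalled`, and `UnhookThenDestroy` are hypothetical names, and the atomic pointer stands in for the aws-c-common global logger. Only the 25ms figure comes from the description above.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <thread>

// Hypothetical logger object; counts messages instead of writing them.
struct LoggerSketch {
    std::atomic<int> messages{0};
};

// Hypothetical stand-in for the global logger pointer.
std::atomic<LoggerSketch*> g_logger{nullptr};

// What a consumer thread does: grab the pointer, then use it.
// A thread can be preempted between the load and the use, which is
// exactly the window the grace period is meant to cover.
void LogIfInstalled() {
    LoggerSketch* logger = g_logger.load();
    if (logger != nullptr) {
        logger->messages.fetch_add(1);
    }
}

// Unhook the logger, wait a grace period, then destroy it.
// This is best effort only: a thread stalled longer than `grace`
// after loading the pointer would still touch freed memory.
void UnhookThenDestroy(LoggerSketch* owned, std::chrono::milliseconds grace) {
    g_logger.store(nullptr);             // new callers now see no logger
    std::this_thread::sleep_for(grace);  // let in-flight callers finish
    delete owned;                        // safe(r), not guaranteed safe
}
```

A caller would install the logger, and at shutdown call `UnhookThenDestroy(logger, std::chrono::milliseconds(25))`. The delay narrows the race but does not remove it, which is why the description calls it a hack.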
Check all that applies:
Check which platforms you have built SDK on to verify the correctness of this PR.
By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.