SEGFAULTs in Aws::Iotsecuretunneling::SecureTunnel internals #669
Labels
bug: This issue is a bug.
p2: This is a standard priority issue.
pending-release: This issue will be fixed by an approved PR that hasn't been released yet.
Describe the bug
Sporadic SEGFAULTs occur in Aws::Iotsecuretunneling::SecureTunnel internal routines when starting and stopping a connection to an invalid hostname.
The test case is simple:
1. Use an invalid hostname: data.region.fake;
2. Start() the SecureTunnel instance;
3. Receive the OnConnectionFailure callback with an AWS_IO_DNS_INVALID_NAME or AWS_IO_DNS_QUERY_FAILED error;
4. Stop() the SecureTunnel instance;
5. Receive the OnStopped callback;
The test application may crash in different libaws-c-*.so routines.
Expected Behavior
The process should not crash with a SEGFAULT during or after invoking the SecureTunnel::Start() and SecureTunnel::Stop() methods.
Current Behavior
Aws::Iotsecuretunneling::SecureTunnel may crash with a SEGFAULT in internal routines on creation or while starting/stopping a connection to an invalid hostname (low reproducibility). In the Debug build the current code triggers assertions in the libaws-c-*.so libraries:
Crash 1: corrupted linked list in AwsEventLoop threads:
At the assertion in AwsEventLoop 2, the main thread has called the Stop() method and is waiting on a promise for the OnStopped notification from the callback:
The detailed GDB backtrace per threads
Crash 2: related to DNS resolving thread:
The detailed GDB backtrace per threads
Reproduction Steps
There is a small test based on samples/secure_tunneling/secure_tunnel which can reproduce these crashes: https://github.com/pkarneliuk/aws-iot-device-sdk-cpp-v2/blob/main/samples/secure_tunneling/secure_tunnel/main.cpp
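The start/stop sequence described in this issue, with the main thread blocking on a promise until the stop notification arrives, can be sketched roughly as follows. This is only an illustration of the calling pattern: FakeTunnel, its members and RunStartStopOnce are hypothetical stand-ins invented for this sketch, not the SDK's SecureTunnel API, and the error code is an arbitrary integer.

```cpp
#include <cassert>
#include <functional>
#include <future>
#include <thread>

// Hypothetical stand-in for a tunnel client; NOT the SDK's SecureTunnel class.
// Callbacks are invoked from a worker thread, as an event loop would do.
class FakeTunnel {
public:
    std::function<void(int)> onConnectionFailure; // argument: error code
    std::function<void()> onStopped;

    // Pretend the connection attempt fails asynchronously (e.g. a DNS error).
    void Start(int errorCode) {
        worker_ = std::thread([this, errorCode] {
            if (onConnectionFailure) onConnectionFailure(errorCode);
        });
    }

    // Wait for the connect attempt to finish, then report the stop asynchronously.
    void Stop() {
        if (worker_.joinable()) worker_.join();
        worker_ = std::thread([this] {
            if (onStopped) onStopped();
        });
    }

    ~FakeTunnel() {
        if (worker_.joinable()) worker_.join();
    }

private:
    std::thread worker_;
};

// Runs Start -> failure callback -> Stop -> stopped callback once and returns
// the error code that was delivered to the failure callback.
int RunStartStopOnce(int simulatedError) {
    std::promise<int> failed;
    std::promise<void> stopped;

    FakeTunnel tunnel; // declared last, so it is destroyed before the promises
    tunnel.onConnectionFailure = [&failed](int code) { failed.set_value(code); };
    tunnel.onStopped = [&stopped] { stopped.set_value(); };

    tunnel.Start(simulatedError);
    int code = failed.get_future().get(); // block until the failure callback

    tunnel.Stop();
    stopped.get_future().get();           // block until the stopped callback
    return code;
}
```

The tunnel object is declared after the promises so that its destructor joins the worker thread before the promises captured by the callbacks go out of scope.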
Possible Solution
It looks like the SEGFAULTs (and the assertions in Debug) are caused by missing thread synchronization in libaws-c-io.so and libaws-c-common.so, so data structures get corrupted. The assertion "aws_event_loop_thread_is_callers_thread(event_loop)" during DNS resolving may be caused by an event task being scheduled on an unexpected thread as some default behavior.
Additional Information/Context
The Ubuntu 20.04.6 LTS x86_64 host has 8 cores, so there are 4 threads in the internal default Aws::Crt::Io::EventLoopGroup instance.
SDK version used
1.31.0
Environment details (OS name and version, etc.)
Ubuntu 20.04.6 LTS on x86_64 (8 cores)
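To make the hypothesis from Possible Solution more concrete, here is a minimal sketch of the thread-affinity invariant that an assertion like aws_event_loop_thread_is_callers_thread presumably guards. The task list owned by a loop is only touched on the loop's own thread, and other threads hand work over through a mutex-protected inbox. LoopSketch and CountTasksScheduledFromOtherThreads are hypothetical names made up for this example. This is not the aws-c-io implementation.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <functional>
#include <list>
#include <mutex>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical single-threaded loop: local_ is owned by one thread and is
// never locked; inbox_ is the only structure shared between threads.
class LoopSketch {
public:
    // Record the calling thread as the loop's owner.
    void BindToCurrentThread() { owner_ = std::this_thread::get_id(); }

    // Safe from any thread: only the mutex-protected inbox is modified.
    void Schedule(std::function<void()> task) {
        std::lock_guard<std::mutex> lock(inboxMutex_);
        inbox_.push_back(std::move(task));
    }

    // Must be called on the owner thread; returns the number of tasks run.
    std::size_t RunPending() {
        // Affinity check, similar in spirit to the assertion named in this issue.
        assert(std::this_thread::get_id() == owner_);
        {
            std::lock_guard<std::mutex> lock(inboxMutex_);
            local_.splice(local_.end(), inbox_); // move all queued tasks at once
        }
        std::size_t ran = 0;
        while (!local_.empty()) {
            std::function<void()> task = std::move(local_.front());
            local_.pop_front();
            task();
            ++ran;
        }
        return ran;
    }

private:
    std::thread::id owner_;
    std::mutex inboxMutex_;
    std::list<std::function<void()>> inbox_; // shared, guarded by inboxMutex_
    std::list<std::function<void()>> local_; // owner thread only
};

// Schedules tasks from several producer threads, runs them on the owner
// thread, and returns how many were executed.
int CountTasksScheduledFromOtherThreads(int producers, int perProducer) {
    LoopSketch loop;
    loop.BindToCurrentThread();
    std::atomic<int> executed{0};

    std::vector<std::thread> threads;
    for (int p = 0; p < producers; ++p) {
        threads.emplace_back([&loop, &executed, perProducer] {
            for (int i = 0; i < perProducer; ++i) {
                loop.Schedule([&executed] { ++executed; });
            }
        });
    }
    for (std::thread &t : threads) t.join();

    loop.RunPending();
    return executed.load();
}
```

If producers pushed directly into the unlocked list instead of the inbox, concurrent insertions could corrupt the list links, which is the kind of failure described under Crash 1.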