Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

TSAN error when accessing default_thread_local_writer #131

Open
andrewkcorcoran opened this issue Jul 27, 2021 · 10 comments
Open

TSAN error when accessing default_thread_local_writer #131

andrewkcorcoran opened this issue Jul 27, 2021 · 10 comments

Comments

@andrewkcorcoran
Copy link

The below minimal example consistently fails with a TSAN error. The original use case was when trying to change the name and id of the default thread local writer to match internal thread names/ids.

#include <thread>
#include <vector>

#include "binlog/binlog.hpp"

int main()
{
    auto v = std::vector<std::thread>{};
    for(int i = 0; i < 10; ++i)
    {
        v.emplace_back(std::thread{[](){
              binlog::default_thread_local_writer().setId(42);
                }});
    }
    for(auto & t : v)
    {
        t.join();
    }
    return 0;
}
@andrewkcorcoran
Copy link
Author

Adding the below snippet to https://github.com/morganstanley/binlog/blob/main/include/binlog/default_session.hpp#L17 fixes the issue but I can't say I fully understand why as at first glance the code in that method looks thread safe to me.

inline std::string this_thread_id_string() {
   static auto m = std::mutex{};
   auto lock = std::scoped_lock{m};
....
}

@erenon
Copy link
Contributor

erenon commented Aug 2, 2021

Hi, thanks for the report. I'm unable to reproduce. Here's what I tried:

$ /usr/bin/c++ --version
c++ (GCC) 11.1.0
$ cat ../../test/integration/ConcurrentSetId.cpp
#include <thread>
#include <vector>

#include <binlog/binlog.hpp>

int main()
{
  auto v = std::vector<std::thread>{};
  for(int i = 0; i < 10; ++i)
  {
    v.emplace_back(std::thread{[](){
        binlog::default_thread_local_writer().setId(42);
        }});
  }
  for(auto & t : v)
  {
    t.join();
  }
  return 0;
}
$ cmake ../.. -DBINLOG_USE_TSAN=On
$ make ConcurrentSetId VERBOSE=1 -B
...
[ 50%] Building CXX object CMakeFiles/ConcurrentSetId.dir/test/integration/ConcurrentSetId.cpp.o
/usr/bin/c++  -I/home/erenon/src/binlog/include -O2 -g -DNDEBUG -fsanitize=thread -Wall -Wextra -Werror -pedantic -Wconversion -Wsign-conversion -Wold-style-cast -Wsuggest-override -Wshadow -std=c++14 -MD -MT CMakeFiles/ConcurrentSetId.dir/test/integration/ConcurrentSetId.cpp.o -MF CMakeFiles/ConcurrentSetId.dir/test/integration/ConcurrentSetId.cpp.o.d -o CMakeFiles/ConcurrentSetId.dir/test/integration/ConcurrentSetId.cpp.o -c /home/erenon/src/binlog/test/integration/ConcurrentSetId.cpp
[100%] Linking CXX executable ConcurrentSetId
/usr/bin/cmake -E cmake_link_script CMakeFiles/ConcurrentSetId.dir/link.txt --verbose=1
/usr/bin/c++ -O2 -g -DNDEBUG -fsanitize=thread CMakeFiles/ConcurrentSetId.dir/test/integration/ConcurrentSetId.cpp.o -o ConcurrentSetId 
...
$ ./ConcurrentSetId 
$ echo $?
0

@andrewkcorcoran
Copy link
Author

Interesting.

I can't duplicate on either gcc or clang when adding my sample code into the binlog integration tests versus what was working before (having it as a standalone binary using binlog as an library).

If I add the line binlog::default_thread_local_writer().setName("test"); I can recreate the error in clang (but not gcc). Possibly related - I had to disable https://github.com/morganstanley/binlog/blob/main/cmake/ThreadSanitizer.cmake#L23 as it was giving me a false return re. clangs support for thread sanitizer?

@andrewkcorcoran
Copy link
Author

$ git diff --staged
diff --git a/CMakeLists.txt b/CMakeLists.txt
index 859229b..828eedc 100644
--- a/CMakeLists.txt
+++ b/CMakeLists.txt
@@ -286,6 +286,15 @@ add_example(TscClock)

   message(STATUS "Build integration tests")

+  add_executable(ConcurrentSetId test/integration/ConcurrentSetId.cpp)
+    target_compile_definitions(ConcurrentSetId PRIVATE
+      DOCTEST_CONFIG_SUPER_FAST_ASSERTS
+      DOCTEST_CONFIG_NO_MULTI_LANE_ATOMICS
+    )
+    target_link_libraries(ConcurrentSetId headers)
+    target_include_directories(ConcurrentSetId SYSTEM PRIVATE ${CMAKE_CURRENT_SOURCE_DIR}/test) # for doctest/doctest.h
+    set_target_properties(ConcurrentSetId PROPERTIES CXX_CLANG_TIDY "")
+
   add_executable(IntegrationTest test/integration/IntegrationTest.cpp)
     target_compile_definitions(IntegrationTest PRIVATE
       DOCTEST_CONFIG_SUPER_FAST_ASSERTS
diff --git a/cmake/ThreadSanitizer.cmake b/cmake/ThreadSanitizer.cmake
index c9d340e..0256947 100644
--- a/cmake/ThreadSanitizer.cmake
+++ b/cmake/ThreadSanitizer.cmake
@@ -20,7 +20,8 @@ if (BINLOG_USE_TSAN)
     add_compile_options("-fsanitize=thread" "-static-libtsan")
     add_link_options("-fsanitize=thread" "-static-libtsan")
   else ()
-    message(FATAL_ERROR "Thread Sanitizer requested by BINLOG_USE_TSAN, but appears to be not supported on this platform")
+    add_compile_options("-fsanitize=thread")
+    add_link_options("-fsanitize=thread")
   endif ()

   set(MEMORYCHECK_TYPE ThreadSanitizer)
diff --git a/test/integration/ConcurrentSetId.cpp b/test/integration/ConcurrentSetId.cpp
new file mode 100644
index 0000000..9a91c14
--- /dev/null
+++ b/test/integration/ConcurrentSetId.cpp
@@ -0,0 +1,21 @@
+#include <thread>
+#include <vector>
+
+#include <binlog/binlog.hpp>
+
+int main()
+{
+       auto v = std::vector<std::thread>{};
+       for(int i = 0; i < 10; ++i)
+       {
+               v.emplace_back(std::thread{[](){
+                               binlog::default_thread_local_writer().setId(42);
+                               binlog::default_thread_local_writer().setName("test");
+                               }});
+       }
+       for(auto & t : v)
+       {
+               t.join();
+       }
+       return 0;
+}
$ gcc --version
gcc (GCC) 10.2.1 20200804 (Red Hat 10.2.1-2)
$ clang --version
clang version 10.0.1 (Red Hat 10.0.1-4.el7)

@erenon
Copy link
Contributor

erenon commented Aug 3, 2021

cmake issue: it seems that recent setups require -fsanitize=thread passed to the linker as well, here's the fix:

$ git diff cmake/         
diff --git a/cmake/ThreadSanitizer.cmake b/cmake/ThreadSanitizer.cmake
index c9d340e..fbbc187 100644
--- a/cmake/ThreadSanitizer.cmake
+++ b/cmake/ThreadSanitizer.cmake
@@ -5,7 +5,11 @@ option(BINLOG_USE_TSAN "Build with Thread Sanitizer (TSAN) enabled" OFF)
 
 if (BINLOG_USE_TSAN)
 
-  check_cxx_compiler_flag("-fsanitize=thread" HAS_TSAN)
+  CMAKE_PUSH_CHECK_STATE(RESET)
+    # Make check_cxx_compiler_flag pass required flags to linker as well:
+    set(CMAKE_REQUIRED_FLAGS "-fsanitize=thread")
+    check_cxx_compiler_flag("-fsanitize=thread" HAS_TSAN)
+  CMAKE_POP_CHECK_STATE()
 
   CMAKE_PUSH_CHECK_STATE(RESET)
     # Make check_cxx_compiler_flag pass required flags to linker as well:

Even if I add binlog::default_thread_local_writer().setName("test");, I'm unable to repro (tried both clang and gcc):

$ gcc --version
gcc (GCC) 11.1.0
$ clang --version
clang version 12.0.0

@andrewkcorcoran
Copy link
Author

I guess the only other difference I can see is the compiler version - perhaps a false positive in clang 10? I don't currently have access to clang 11 - are you able to downgrade to 10 and see if you can reproduce?

@erenon
Copy link
Contributor

erenon commented Aug 3, 2021

I can reproduce it with clang 10 and clang 11, but not with clang 12.

@erenon
Copy link
Contributor

erenon commented Aug 3, 2021

FTR, the race is on:

#0 memcmp
#1 std::ctype<char>::_M_widen_init
#2 binlog::default_thread_local_writer() 

Frame 2 creates a std::string, in the constructor of the function local static. I'm not sure if this is a real issue, or a false positive. I'm trying to minimize it (reproduce without binlog), unfortunately, the report is not deterministic: sometimes it throws a warning, sometimes it does not.

@andrewkcorcoran
Copy link
Author

Any update on this? We've moved to clang 12 and gcc 11 and this error is occurring on both regularly (though not 100%)

@erenon
Copy link
Contributor

erenon commented Mar 2, 2022

hi, no update. I wasn't able to reproduce it on clang 13, and discarded this as a clang/tsan bug. The short piece of code I was able to repro the issue on older clangs (10, 11, 12) seemed perfectly fine to me.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants