Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

NDEBUG flag breaks binary compatibility, which can cause mysterious crashes #272

Open
kosak opened this issue Sep 24, 2023 · 5 comments
Open

Comments

@kosak
Copy link

kosak commented Sep 24, 2023

Code in immer/config.hpp causes IMMER_TAGGED_NODE to be set to 0 or 1 according to whether NDEBUG is defined. This in turn affects the definition of immer::detail::rbts::node; in particular whether or not impl_data_t contains a kind_t kind field.

This can cause problems if user code compiled with NDEBUG set is linked to other code compiled without NDEBUG, because the Immer data structures are not binary compatible between these two versions.

This situation can arise in a complicated build system where libraries are built by different build steps. Even though ideally one should try to have a coherent set of flags in all cases, I think people expect NDEBUG to be harmless, at most controlling whether assertions are enabled or perhaps turning on some log messages. In my opinion it would be surprising to learn that NDEBUG would break binary compatibility. In my company's case we had Rcpp building some of our code and it used an NDEBUG flag in its build steps.

In the example below, the code in main.cpp doesn't even call the code in util.cpp, but util.cpp poisons the binary, causing main.cpp to crash. Steps to reproduce:

// main.cpp
#include "immer/vector.hpp"

int main() {
  immer::vector<int> v0;
  for (size_t i = 0; i < 100; ++i) {
    v0 = v0.push_back(13);
  }
}
// util.cpp
#include "immer/vector.hpp"

void notcalled(const immer::vector<int> &v0) {
  (void)v0[0];
}

Command:

g++ -c -I. -DNDEBUG main.cpp
g++ -c -I. util.cpp
g++ -o doit util.o main.o 
./doit
doit: ./immer/detail/rbts/node.hpp:175: immer::detail::rbts::node<T, MemoryPolicy, B, BL>::node_t** immer::detail::rbts::node<T, MemoryPolicy, B, BL>::inner() [with T = int; MemoryPolicy = immer::memory_policy<immer::free_list_heap_policy<immer::cpp_heap>, immer::refcount_policy, immer::spinlock_policy>; unsigned int B = 5; unsigned int BL = 6; immer::detail::rbts::node<T, MemoryPolicy, B, BL>::node_t = immer::detail::rbts::node<int, immer::memory_policy<immer::free_list_heap_policy<immer::cpp_heap>, immer::refcount_policy, immer::spinlock_policy>, 5, 6>]: Assertion `kind() == kind_t::inner' failed.
Aborted (core dumped)

This can be a tricky thing to diagnose and debug. In my opinion, Immer ought to do one of two things. Either:

  1. NDEBUG should not control IMMER_TAGGED_NODE (nor IMMER_ENABLE_DEBUG_SIZE_HEAP); if the programmer wants these behaviors they should have to set those flags explicitly and not get them via NDEBUG.

or:

  1. These flags should change the namespace or classnames of all affected code (i.e. both the node data structure and any code referencing the node data structure). This would have the nice property that the two versions could coexist side-by-side and in the worst case an inconsistency would lead to a link error rather than a mysterious crash.

If you agree, I would be happy to contribute code for item 1 or 2, once I know what your preference is.

@arximboldi
Copy link
Owner

Hi @kosak!

Good find. I think you're right. I'm happy with option 1. When you make the PR, maybe you can add a CMake flag (-DImmer_DEBUG_TOOLS) or alike to enable those easily during development? Also remember to enable it in test.yml so in Debug builds that in CI it is still enabled.

@jeaye
Copy link

jeaye commented Jul 12, 2024

200

Two full days I've spent narrowing down a test case in jank, thinking this was due to UB in Clang 19's JIT compilation. After ruling everything else out, I could only reproduce the issue with immer containers. I didn't spot the issue myself, and I wasn't close to doing so. Fortunately I did take a look at the open issues, though.

@kosak, you're a saint. Thanks for finding this and making the issue. For me, the way it showed up is that jank is compiled with optimizations for release builds, but it doesn't enable optimizations by default for JIT compiled code (those functions can be later optimized on demand).

I can work around this for now by just defining IMMER_TAGGED_NODE myself, both in AOT and JIT compiled code. But I agree that having ABI differences dependent on NDEBUG is gobbling up sanity for other people around the world right now too.

@kosak
Copy link
Author

kosak commented Jul 12, 2024

Thank you for the kind words. I'm so glad it helped you! Regarding "saint", recently I've been feeling pretty guilty for not following up with a fix. I could finally put something together or if someone else wants to do that, that's fine too.

@jeaye
Copy link

jeaye commented Jul 13, 2024

The fix is all yours, if you're up for it, my dude. 🙇

@arximboldi
Copy link
Owner

@kosak a fix would be very welcome!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants