Why does cuda::atomic::store(memory_order_seq_cst) generate a relaxed store instead of a release store? #3827

Answered by gonzalobg
admbbs asked this question in libcu++

Very good question!

First, libcu++ atomics currently rely on implementation details that, on currently supported platforms, allow libcu++ to lower:

  • sequentially-consistent stores to `fence.sc; st.relaxed;` instead of `fence.sc; st.release;`.
  • sequentially-consistent RMWs to `fence.sc; atom.acquire;` instead of `fence.sc; atom.acq_rel;`.

libcu++ is closely tied to the implementation (CUDA Toolkit, compiler, driver, hardware), and if any of the above changes, we'll update the lowering accordingly.
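To make concrete what a sequentially-consistent store has to guarantee regardless of how it is lowered, here is a minimal host-side sketch of the classic store-buffering litmus test. It uses `std::atomic` as a stand-in, on the assumption that `cuda::atomic` exposes the same `store`/`load` interface. The names `SbResult`, `run_store_buffering_once`, and `forbidden_outcome_seen` are hypothetical helpers for illustration, and the PTX sequences in the comments are only the ones quoted above, not the output of any particular compiler version.

```cpp
#include <atomic>
#include <thread>

// Values observed by the two threads of one litmus-test run.
struct SbResult {
    int r0;  // what thread 0 read from y
    int r1;  // what thread 1 read from x
};

// Store-buffering test: each thread stores to one variable, then loads the other.
// With memory_order_seq_cst on every access, r0 == 0 && r1 == 0 is forbidden.
SbResult run_store_buffering_once() {
    std::atomic<int> x{0};
    std::atomic<int> y{0};
    SbResult out{-1, -1};

    std::thread t0([&] {
        // On the device, the answer above says libcu++ currently lowers a
        // seq_cst store to `fence.sc; st.relaxed;` (not `fence.sc; st.release;`).
        x.store(1, std::memory_order_seq_cst);
        out.r0 = y.load(std::memory_order_seq_cst);
    });
    std::thread t1([&] {
        y.store(1, std::memory_order_seq_cst);
        out.r1 = x.load(std::memory_order_seq_cst);
    });

    t0.join();
    t1.join();
    return out;
}

// Repeats the test and reports whether the forbidden outcome was ever observed.
bool forbidden_outcome_seen(int iterations) {
    for (int i = 0; i < iterations; ++i) {
        const SbResult r = run_store_buffering_once();
        if (r.r0 == 0 && r.r1 == 0) {
            return true;
        }
    }
    return false;
}
```

Calling `forbidden_outcome_seen(1000)` should return `false` on a conforming implementation. A passing run does not prove that a lowering is correct; it only fails to find a counterexample.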

Second, you are totally right that the current expansion is not correct according to the model published in the ASPLOS ’19 paper, or the PTX Atomics ABI which is what we require external SW to follow. We’ve actually considered…

Answer selected by admbbs