[Question]: Why does nvshmem_fence default to using nvshmemi_fence<NVSHMEMI_THREADGROUP_THREAD>()? #61

@kwu130

Description

Question

Does using NVSHMEMI_THREADGROUP_THREAD as the default scope cause redundant work? Specifically, when nvshmem_fence() is called by every thread of a warp or block, each thread executes nvshmemi_ibgda_fence() with index_in_scope == 0 and scope_size == 1, so each thread independently iterates over all DCIs and RC QPs and issues an ibgda_quiet(qp) call for every one of them. Could this lead to significant performance overhead?
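To make the concern concrete, here is a small host-side C++ model of the call counts. It is only a sketch under assumptions: it assumes the fence walks the QPs with a strided loop driven by index_in_scope and scope_size, and the function names (quiets_per_caller, total_thread_scope, total_group_scope) are hypothetical, not NVSHMEM APIs. It does not model what ibgda_quiet(qp) actually costs, only how many times it would be invoked.

```cpp
#include <cassert>
#include <cstddef>

// Host-side model only. These names are hypothetical, not NVSHMEM APIs.

// Number of per-QP quiet operations a single caller issues, assuming the QPs
// are divided across the scope with a strided loop.
inline std::size_t quiets_per_caller(std::size_t num_qps,
                                     std::size_t index_in_scope,
                                     std::size_t scope_size) {
    std::size_t count = 0;
    for (std::size_t qp = index_in_scope; qp < num_qps; qp += scope_size) {
        ++count;  // stands in for one ibgda_quiet(qp) call
    }
    return count;
}

// Total quiets when every caller uses thread scope
// (index_in_scope == 0, scope_size == 1), as described in the question.
inline std::size_t total_thread_scope(std::size_t num_qps, std::size_t callers) {
    std::size_t total = 0;
    for (std::size_t t = 0; t < callers; ++t) {
        total += quiets_per_caller(num_qps, 0, 1);
    }
    return total;
}

// Total quiets when the same callers cooperate as one scope of size `callers`,
// each taking a distinct stride of the QP list.
inline std::size_t total_group_scope(std::size_t num_qps, std::size_t callers) {
    std::size_t total = 0;
    for (std::size_t t = 0; t < callers; ++t) {
        total += quiets_per_caller(num_qps, t, callers);
    }
    return total;
}

// Sanity check: a cooperating group visits each QP exactly once.
inline void self_check() {
    assert(total_group_scope(8, 32) == 8);
}
```

With illustrative numbers of 8 QPs and a 32-thread warp, this model gives 256 quiet calls in thread scope versus 8 when the warp shares the work. Whether that difference matters in practice depends on how cheap a quiet on an already-drained QP is, which the model does not capture.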
