feat: add DMA-BUF support #618

aws-nslick · 2024-09-21T08:16:28Z

feat: add DMA-BUF support

This adds DMA-BUF support to the plugin and enables it under the
following conditions:

At build time, libfabric>=1.20 is required (build checks for FI_MR_DMABUF).

At runtime:
 + The specific version of NCCL being used supports DMA-BUF and passes
   valid dmabuf fds to the plugin.
 + FI_HMEM must be supported.
  + For CUDA accelerators, CU_DEVICE_ATTRIBUTE_DMA_BUF_SUPPORTED is
   queried and FI_HMEM_CUDA is requested.
  + For Neuron, we assume all nrt versions are viable of dmabuf
    export.

Libfabric as of today provides no hints in at init time that allow the
plugin to differentiate between a provider that merely has FI_HMEM
support, and one that has dmabuf support. In the case that the plugin is
built against libfabric>=1.20 but libfabric is unable to handle dmabuf
registrations, a new environment variable (OFI_NCCL_DMABUF_DISABLE=1) is
introduced to force the legacy path. When set, dmabuf support is not
advertised to nccl and this ensures that the plugin remains in the
legacy iovec path.

Testing: Various combinations of
  + OFI_NCCL_DISABLE_DMABUF=0/1
  + OFI_NCCL_PROTOCOL=RDMA/SENDRECV
  + FI_HMEM_CUDA_USE_GDRCOPY=0/1

Signed-off-by: Nicholas Sielicki <nslick@amazon.com>

Note that currently this requires a patch for libfabric or explicitly disabling gdrcopy: aws-nslick/libfabric@a475d9140

prov/efa: entirely avoid gdrcopy on FI_MR_DMABUF
efa_mr_hmem_setup previously always called ofi_hmem_dev_register on
FI_HMEM_CUDA arguments when gdrcopy was enabled, regardless of whether
FI_MR_DMABUF was also present and gdrcopy was not to be involved. Apps
who wish to conditionally use fi_regattr with dmabuf descriptors by
passing FI_MR_DMABUF, but who reserve the right to also provide iovec
args that require pinning (provided in libfabric by libgdrcopy), find
themselves in a position of needing to explicitly disable gdrcopy to
support DMABUF prior to this patch.

When FI_MR_DMABUF flags are passed here, just exit early.

Signed-off-by: Nicholas Sielicki <nslick@amazon.com>

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

check at build time for at least cuda 11.7, remove runtime checks revalidating this. Prefer cudart functions for things like checking versions for simplicity. Drop cuda_check functional test as its intention was to ensure that CUDA was not linked, but CUDA is now linked. This also needed to fix nvtx's m4 autodetection, and our github actions needed to use the actual cuda repos (previously were using ancient 11.x toolkits from ubuntu universe repos). Signed-off-by: Nicholas Sielicki <nslick@amazon.com>

Along the interface with NCCL, continue to return -ENOTSUP, but for the internal api remove dmabuf fnptrs within the communicators and delete the impls they pointed at. Signed-off-by: Nicholas Sielicki <nslick@amazon.com>

The MR cache needs to be capable of handling triplets of {base, offset, len} in addition to the current arguments of {base, len}. Add a tagged union with a functional interface that can represent this generically. Tagged union members are struct iovec and struct fi_mr_dmabuf, which matches the union within struct fi_mr_attr; Signed-off-by: Nicholas Sielicki <nslick@amazon.com>

Immediately on any top-level NCCL call, construct an immutable nccl_ofi_mr_input_t on the stack, then pass that to communicator regmr implementations. Add a flags argument to internal regmr functions such that the input can be inspected and may add FI_MR_DMABUF if the input arguments correspond to a file descriptor. Implement top-level nccl_net_ofi_regMr in terms of nccl_net_ofi_regMrDmaBuf, simply forwarding arguments alongside an invalid file descriptor (-1) and a zero offset. DMA-BUF remains unsupported as of this commit, but only due to not advertising support back to NCCL/nccom. Signed-off-by: Nicholas Sielicki <nslick@amazon.com>

This adds DMA-BUF support to the plugin and enables it under the following conditions: At build time, libfabric>=1.20 is required (build checks for FI_MR_DMABUF). At runtime: + The specific version of NCCL being used supports DMA-BUF and passes valid dmabuf fds to the plugin. + FI_HMEM must be supported. + For CUDA accelerators, CU_DEVICE_ATTRIBUTE_DMA_BUF_SUPPORTED is queried and FI_HMEM_CUDA is requested. + For Neuron, we assume all nrt versions are viable of dmabuf export. Libfabric as of today provides no hints in at init time that allow the plugin to differentiate between a provider that merely has FI_HMEM support, and one that has dmabuf support. In the case that the plugin is built against libfabric>=1.20 but libfabric is unable to handle dmabuf registrations, a new environment variable (OFI_NCCL_DMABUF_DISABLE=1) is introduced to force the legacy path. When set, dmabuf support is not advertised to nccl and this ensures that the plugin remains in the legacy iovec path. Testing: Various combinations of + OFI_NCCL_DISABLE_DMABUF=0/1 + OFI_NCCL_PROTOCOL=RDMA/SENDRECV + FI_HMEM_CUDA_USE_GDRCOPY=0/1 Signed-off-by: Nicholas Sielicki <nslick@amazon.com>

aws-nslick force-pushed the dmabuf branch 17 times, most recently from e99cff1 to df356b5 Compare September 22, 2024 00:26

aws-nslick added 4 commits September 21, 2024 17:34

refactor(api): delete previous DMA-BUF stub support

ab8fd91

Along the interface with NCCL, continue to return -ENOTSUP, but for the internal api remove dmabuf fnptrs within the communicators and delete the impls they pointed at. Signed-off-by: Nicholas Sielicki <nslick@amazon.com>

aws-nslick force-pushed the dmabuf branch from df356b5 to 68ff64d Compare September 22, 2024 00:34

aws-nslick marked this pull request as ready for review September 22, 2024 00:36

aws-nslick requested review from rajachan, bwbarrett and a team as code owners September 22, 2024 00:36

aws-nslick mentioned this pull request Sep 22, 2024

fix(cuda): avoid loading stub #581

Closed

aws-nslick force-pushed the dmabuf branch from 68ff64d to 2200b81 Compare September 22, 2024 04:21

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat: add DMA-BUF support #618

feat: add DMA-BUF support #618

aws-nslick commented Sep 21, 2024 •

edited

Loading

feat: add DMA-BUF support #618

Are you sure you want to change the base?

feat: add DMA-BUF support #618

Conversation

aws-nslick commented Sep 21, 2024 • edited Loading

aws-nslick commented Sep 21, 2024 •

edited

Loading