@michal-shalev michal-shalev commented Sep 22, 2025

What?

Implement support for passing local and remote addresses to UCX device memory lists, enabling GPU-initiated transfers with pre-configured address pairs and offset-based operations. Also add a channel ID to the device APIs.

Why?

Improve user experience by allowing GPU kernels to work with simple offsets relative to pre-configured base addresses, rather than requiring users to manage and pass absolute addresses for every transfer operation.
Adding a channel ID to the device APIs enables pre-resolved QP/channel selection at runtime: users can register memory once for all experts and then dynamically choose the correct channel during execution, which reduces connection setup time and initialization overhead.

How?

  • Extended createGpuXferReq to accept local memory objects, remote rkeys, and remote addresses
  • Added getBase() method to nixlUcxMem to access local addresses
  • Added getSize() method to nixlUcxMem to access descriptor size
  • Modified createGpuXferReq implementation to populate local_addr and remote_addr fields in UCX memory list elements
  • Updated UCX backend to extract addresses from descriptors and pass them to the device memory list creation
  • Added channel_id to NIXL device APIs
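
To illustrate the offset-based model described above, here is a minimal host-side sketch in C++17. It is not the NIXL or UCX implementation: `MemListElem`, `ResolvedXfer`, and `resolveXfer` are hypothetical names, and the sketch assumes the local and remote buffers of a pair share the same descriptor size. Only the `local_addr` and `remote_addr` field names come from the PR description.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical stand-in for one memory list element (not the UCX/NIXL type).
struct MemListElem {
    uint64_t local_addr;   // base address of the registered local buffer
    uint64_t remote_addr;  // base address of the paired remote buffer
    size_t size;           // descriptor size in bytes (assumed equal on both sides)
};

// Absolute addresses derived from a (base, offset) pair.
struct ResolvedXfer {
    uint64_t src;
    uint64_t dst;
    size_t len;
};

// Resolve offsets against the pre-configured bases of element `index`.
// Returns std::nullopt if the index or the requested range is out of bounds.
inline std::optional<ResolvedXfer>
resolveXfer(const std::vector<MemListElem> &list, size_t index,
            size_t local_offset, size_t remote_offset, size_t len) {
    if (index >= list.size()) return std::nullopt;
    const MemListElem &e = list[index];
    // Written to avoid unsigned overflow in offset + len.
    if (local_offset > e.size || len > e.size - local_offset) return std::nullopt;
    if (remote_offset > e.size || len > e.size - remote_offset) return std::nullopt;
    return ResolvedXfer{e.local_addr + local_offset,
                        e.remote_addr + remote_offset, len};
}
```

With the bases stored once at request creation, the caller only supplies an element index and two offsets per transfer, which is the simplification the PR aims for on the GPU side.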

Signed-off-by: Michal Shalev <mshalev@nvidia.com>
@itayalroy

This PR is missing a change to nixlGpuSignal:

struct nixlGpuSignal {
    uint64_t inc = 0;
    uint64_t remote_addr = 0;
};

should be changed to:

struct nixlGpuSignal {
    uint64_t inc = 0;
    uint64_t offset = 0;
};
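
A rough sketch of how the renamed field might be consumed, assuming the signal's target is resolved against a pre-configured remote base address in the same way as the transfer offsets. The struct mirrors the one proposed above; `signalTargetAddr` and its `remote_base` parameter are hypothetical and only illustrate the base-plus-offset arithmetic.

```cpp
#include <cstdint>

// Struct as proposed in this comment: an offset replaces the absolute address.
struct nixlGpuSignal {
    uint64_t inc = 0;
    uint64_t offset = 0;
};

// Hypothetical helper: compute the absolute signal address from a
// pre-configured remote base and the signal's offset.
inline uint64_t signalTargetAddr(uint64_t remote_base, const nixlGpuSignal &sig) {
    return remote_base + sig.offset;
}
```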

Signed-off-by: Michal Shalev <mshalev@nvidia.com>
@michal-shalev

/build

@michal-shalev michal-shalev requested a review from rakhmets October 6, 2025 15:57
@michal-shalev michal-shalev merged commit e5a9458 into ai-dynamo:main Oct 7, 2025
21 checks passed