Remove cu::Context class #316
base: main
Conversation
I do not think that I like it. It is a major deviation from the idea to stay close to the CUDA driver API. Also, it breaks basically all the libraries and applications that we have.
Going forward, I see a couple of options. In no particular order:
For 1. and 2., we would also have to revert the deprecation message to improve the user experience. I would be strongly in favour of option 3. or 4. Option 3. maintains compatibility with existing code, but could also confuse the user. People arguably shouldn't use the … There is one thing that still may need to be addressed for 3. or 4., the …
If we leave it as is, we should print something during compilation when using HIP. The context functions are doing nothing...
Can't we just make them no-ops for AMD only? Can a Context::setCurrent() be safely ignored on AMD GPUs?
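For what it's worth, a minimal sketch of what such an AMD-only no-op could look like, assuming a setCurrent() method that wraps cuCtxSetCurrent; the guard macro placement and the method itself are illustrative, not the actual cudawrappers code:

```cpp
// Hypothetical sketch (not the actual cudawrappers code): make the context
// call a no-op when compiling with HIP for AMD, keep the driver call for CUDA.
void Context::setCurrent() const {
#if defined(__HIP_PLATFORM_AMD__)
  // Nothing to do: HIP on AMD has no matching notion of a current driver context.
#else
  checkCudaCall(cuCtxSetCurrent(_obj));
#endif
}
```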
The context management is now done in the constructor of the cu::Device class.
To this end, even when the primary context is retained, the returned CUcontext object must be stored in the _context_manager.
With HIP on an AMD GPU, memory allocated as CU_MEMORYTYPE_UNIFIED reports type CU_MEMORYTYPE_HOST, causing the check to fail. It is weird that this didn't cause problems before changing the Context class. Alternatively, we could query and store the memory type in the constructor.
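As a rough illustration of that alternative (the helper name and where it would live are assumptions, not the actual change), the memory type can be queried once through the driver API and stored:

```cpp
// Hypothetical sketch: determine the memory type of an allocation once, e.g.
// in the constructor, instead of re-deriving it at every check. On HIP/AMD,
// unified allocations may report CU_MEMORYTYPE_HOST rather than
// CU_MEMORYTYPE_UNIFIED, which is what makes the original check fail.
CUmemorytype queryMemoryType(CUdeviceptr ptr) {
  unsigned int type = 0;
  checkCudaCall(
      cuPointerGetAttribute(&type, CU_POINTER_ATTRIBUTE_MEMORY_TYPE, ptr));
  return static_cast<CUmemorytype>(type);
}
```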
Can you check the latest code to see how I solved it? There is no need for no-ops. Both in CUDA and HIP mode, you can use the same … The only difference between the two is that CUDA will complain about an invalid device context when you try to use the device without calling …
I have a couple of minor code suggestions. I agree that the flood of deprecated messages is frustrating, so I support finding a better solution as soon as possible.
I do agree with @john-romein that these changes will have a significant impact, maybe even warranting a release bump to 1.x.x. Based on that, I have two suggestions:
- Delay the changes to a later version of cudawrappers and first release 0.9.0, either with or without the deprecated messages.
- Implement the changes now in cudawrappers version 0.9.0, but temporarily include an empty cu::Context to maintain backward compatibility. Internally, it may forward method calls to the corresponding cu::Device methods (see the sketch after this list). This would give cudawrappers users a grace period to adapt.
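To make the second suggestion concrete, here is a minimal sketch of such a temporary shim; the constructor signature, the forwarded method, and the deprecation text are assumptions, not part of this PR:

```cpp
// Hypothetical sketch: an (almost) empty cu::Context kept for one release,
// marked deprecated and forwarding to the representative cu::Device methods.
namespace cu {

class [[deprecated(
    "cu::Context is obsolete; context management moved to cu::Device")]] Context {
 public:
  Context(int /* flags */, Device &device) : _device(device) {}

  // Forward the old entry point to cu::Device.
  void setCurrent() const { _device.ctxSetCurrent(); }

 private:
  Device &_device;
};

}  // namespace cu
```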
  int _ordinal;
};

class Context : public Wrapper<CUcontext> {
Shouldn't some of these functions be implemented inside the Device class? For example: setLimit and getLimit? In case they are not included, I would suggest mentioning it in the changelog.
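For illustration, moving them could look roughly like this; the member function names and their placement on cu::Device are assumptions:

```cpp
// Hypothetical sketch: expose the limit functions on cu::Device by forwarding
// to the driver API calls that act on the current context.
size_t Device::getLimit(CUlimit limit) const {
  size_t value = 0;
  checkCudaCall(cuCtxGetLimit(&value, limit));
  return value;
}

void Device::setLimit(CUlimit limit, size_t value) const {
  checkCudaCall(cuCtxSetLimit(limit, value));
}
```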
I don't know if anyone is really using these functions. They are not available with HIP, are they? I think it's sufficient to inform the user about cu::Context being removed in the changelog.
@@ -208,134 +230,10 @@ class Device : public Wrapper<CUdevice> {
  int getOrdinal() const { return _ordinal; }

 private:
  std::shared_ptr<CUcontext> _context_manager;
Why a std::shared_ptr? A std::unique_ptr should be sufficient here, especially as Device will always maintain ownership. The only use case I see is when a Device instance is passed as a copy instead of a reference, but I don't know if this is done in practice. In my view, a reference should be preferred.
This was done for consistency with the Wrapper class, which also uses std::shared_ptr. I think that this is the 'safe' option, but if you insist I am also fine with changing it to std::unique_ptr.
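For comparison, a std::unique_ptr variant would carry the deleter in its type, which is the main practical difference; the deleter body below is an assumption based on the lambda in the diff:

```cpp
// Hypothetical sketch: unique ownership of the context manager, with a
// deleter that releases the primary context when the cu::Device is destroyed.
struct PrimaryCtxReleaser {
  void operator()(CUdevice *ptr) const {
    cuDevicePrimaryCtxRelease(*ptr);  // undo the earlier retain
    delete ptr;
  }
};

std::unique_ptr<CUdevice, PrimaryCtxReleaser> _context_manager;
```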
  checkCudaCall(cuDevicePrimaryCtxGetState(_obj, &flags, &active));
  if (active) {
    manager =
        std::shared_ptr<CUdevice>(new CUdevice(_obj), [](CUdevice *ptr) {
Minor suggestion: use the shorthand std::make_shared (or std::make_unique, see my other comment) to avoid spelling out the template argument.
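As a side note, std::make_shared cannot attach a custom deleter, so if the deleter is what releases the primary context, the explicit std::shared_ptr constructor has to stay. A sketch of the full retain/release pattern, with the surrounding declarations assumed rather than copied from the PR:

```cpp
// Hypothetical sketch of the retain path: if a primary context is already
// active, retain it and hand ownership to a shared_ptr whose deleter
// releases it again when the last cu::Device copy goes away.
unsigned int flags = 0;
int active = 0;
checkCudaCall(cuDevicePrimaryCtxGetState(_obj, &flags, &active));
if (active) {
  CUcontext context;
  checkCudaCall(cuDevicePrimaryCtxRetain(&context, _obj));
  manager = std::shared_ptr<CUdevice>(new CUdevice(_obj), [](CUdevice *ptr) {
    cuDevicePrimaryCtxRelease(*ptr);  // balance the retain above
    delete ptr;
  });
}
```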
  thread.join();
}

TEST_CASE("Test DeviceMemory with Device::ctxSetCurrent", "[context]") {
Good test case, I think this one deals with the issue we were experiencing with dedisp.
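Roughly, the scenario that test covers looks like the sketch below; the cu::DeviceMemory constructor, the implicit CUdeviceptr conversion, and the exact test body are assumptions, not the actual test code:

```cpp
// Hypothetical sketch: memory allocated in the main thread should remain
// usable from a second thread once the device's context is made current there.
TEST_CASE("Test DeviceMemory with Device::ctxSetCurrent", "[context]") {
  cu::Device device(0);        // constructor retains or creates the context
  cu::DeviceMemory mem(1024);  // allocate in the main thread

  std::thread thread([&]() {
    device.ctxSetCurrent();    // bind the same context in this thread
    // Without ctxSetCurrent this would fail with CUDA_ERROR_INVALID_CONTEXT.
    checkCudaCall(cuMemsetD8(mem, 0, 1024));
  });
  thread.join();
}
```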
Remove the cu::Context class. Context management is now done in the constructor of the cu::Device class. When a Primary Context was already active, e.g. when combining cudawrappers with the NVIDIA Runtime API, that context is retained. If not, a new Context is created.
Some tests had to be removed, while others were adapted slightly. When building in HIP mode, the Context code is disabled. The code is tested locally on NVIDIA and AMD, and all tests pass.
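For downstream users, the change roughly looks like this; the "before" lines follow the style of the existing cudawrappers examples, and the exact constructor arguments are assumptions:

```cpp
// Before this PR: explicit context creation through cu::Context.
void setup_before() {
  cu::init();
  cu::Device device(0);
  cu::Context context(CU_CTX_SCHED_BLOCKING_SYNC, device);
  // ... use device and context ...
}

// After this PR: the cu::Device constructor handles context management,
// retaining an already-active primary context or creating a new one.
void setup_after() {
  cu::init();
  cu::Device device(0);
  // ... use the device directly ...
}
```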
Description
Related issues:
Instructions to review the pull request