Filtering can now be performed on the GPU #32

Open
thwjoy wants to merge 2 commits into master


@thwjoy thwjoy commented May 10, 2018

Filtering is now done on either the CPU or the GPU, depending on which device template argument is used in the crfasrnn op. We have two partially specialized functors (one for GPU and one for CPU), which call their respective implementations. There seems to be an issue with using CUDA through TensorFlow (tensorflow/tensorflow#18441), so we now link directly against CUDA in the g++ stage and use some CUDA macros to check results.
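For readers unfamiliar with the pattern, here is a minimal sketch of device-based partial specialization in plain C++. It is not the code in this PR: `CPUDevice`, `GPUDevice`, `FilterFunctor`, and `RunFilter` are hypothetical names chosen for illustration, and the GPU branch uses a host loop as a stand-in for a kernel launch so the sketch compiles without CUDA.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical device tags (assumption: the real op would use the device
// types provided by its framework instead of these empty structs).
struct CPUDevice {};
struct GPUDevice {};

// Primary template is declared but not defined, so every supported device
// must provide its own partial specialization.
template <typename Device, typename T>
struct FilterFunctor;

// CPU specialization: calls the host implementation.
template <typename T>
struct FilterFunctor<CPUDevice, T> {
  std::string operator()(const T* in, T* out, std::size_t n) const {
    for (std::size_t i = 0; i < n; ++i) out[i] = in[i];  // placeholder filter
    return "cpu";
  }
};

// GPU specialization: a real version would launch a CUDA kernel here.
// A host loop stands in so this sketch builds without the CUDA toolkit.
template <typename T>
struct FilterFunctor<GPUDevice, T> {
  std::string operator()(const T* in, T* out, std::size_t n) const {
    for (std::size_t i = 0; i < n; ++i) out[i] = in[i];  // placeholder filter
    return "gpu";
  }
};

// The op body picks the implementation purely from its Device argument.
template <typename Device, typename T>
std::string RunFilter(const std::vector<T>& in, std::vector<T>& out) {
  out.resize(in.size());
  return FilterFunctor<Device, T>()(in.data(), out.data(), in.size());
}
```

Calling `RunFilter<CPUDevice, float>(in, out)` or `RunFilter<GPUDevice, float>(in, out)` selects the matching specialization at compile time, so the calling code stays identical for both devices.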

I have only tested that the output matches the CPU output, and only with CUDA 9 and TensorFlow 1.7.

thwjoy added 2 commits May 9, 2018 12:27
…elow commits

Calling filtering from functor

Moving away from tensorflow::Tensors, this causes problems when using GPU

Adding functionality to makefile and removing .cc.cul files

Implemented templated specialization of functors, also moved away from the tensorflow CUDA macros and defined my own, we have also reverted back to Tensors

Fixing issue with registering GPU op when no gpu is present

Compiling modified permutohedral.cu, getting an error with invalid pointer

Now filtering on GPU

Fixing channel error in filter operation and removing std::couts

Updates to readme

Tidying up before PR
@sadeepj
Owner

sadeepj commented May 26, 2018

@thwjoy Many thanks for submitting the pull request. Sorry I took so long to test it.

After making some changes to the Makefile, I was able to compile and link successfully with TensorFlow 1.4 and CUDA 8.0. However, when I ran the demo code, I got the following errors:

2018-05-26 18:27:56.442704: E tensorflow/stream_executor/cuda/cuda_dnn.cc:385] could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR
2018-05-26 18:27:56.442732: E tensorflow/stream_executor/cuda/cuda_dnn.cc:352] could not destroy cudnn handle: CUDNN_STATUS_BAD_PARAM
2018-05-26 18:27:56.442752: F tensorflow/core/kernels/conv_ops.cc:667] Check failed: stream->parent()->GetConvolveAlgorithms( conv_parameters.ShouldIncludeWinogradNonfusedAlgo(), &algorithms)
out of memoryout of memoryinvalid argumentCuda kernel failed. Error: invalid argumentinvalid argumentinvalid argumentinvalid argumentCuda kernel failed. Error: invalid argumentinvalid argumentinvalid argumentinvalid argumentCuda kernel failed. Error: invalid argumentinvalid argumentinvalid argumentinvalid argumentCuda kernel failed. Error: invalid argumentinvalid argumentinvalid argumentinvalid argumentCuda kernel failed. Error: invalid argumentinvalid argumentinvalid argumentinvalid argumentCuda kernel failed. Error: invalid argumentinvalid argumentinvalid argumentinvalid argumentCuda kernel failed. Error: invalid argumentan illegal memory access was encounteredan illegal memory access was encounteredan illegal memory access was encounteredan illegal memory access was encounteredan illegal memory access was encounteredan illegal memory access was encounteredan illegal memory access was encounteredAborted (core dumped)

Do you happen to know what could have gone wrong?

In any case, for now, I have merged your pull request into a new branch of the main repo and documented the details in the main README file. Once I manage to fully test your implementation, I'll merge it with the master branch.

@netw0rkf10w

@thwjoy Any updates on this, please?
I got the same errors as @sadeepj :(

out of memoryCuda kernel failed. Error: out of memoryinvalid argumentinvalid argumentCuda kernel failed. Error: invalid argumentinvalid argumentinvalid argumentCuda kernel failed.
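Both reports show error strings printed back to back, with no separator or source location, and execution apparently continuing after the first failure. As a rough illustration only, a check macro could emit one line per failure and return early. This is a sketch, not the macro used in this PR: `Status`, `StatusString`, `FormatFailure`, `CHECK_STATUS`, and `RunStep` are hypothetical names, and `Status` stands in for `cudaError_t` (with `StatusString` in place of `cudaGetErrorString`) so the example compiles without CUDA.

```cpp
#include <iostream>
#include <sstream>
#include <string>

// Hypothetical stand-in for cudaError_t (assumption, so no CUDA is needed).
enum class Status { kSuccess, kInvalidArgument, kOutOfMemory };

// Hypothetical stand-in for cudaGetErrorString.
inline const char* StatusString(Status s) {
  switch (s) {
    case Status::kSuccess: return "no error";
    case Status::kInvalidArgument: return "invalid argument";
    case Status::kOutOfMemory: return "out of memory";
  }
  return "unknown error";
}

// Builds a single newline-terminated message that includes the location.
inline std::string FormatFailure(Status s, const char* expr,
                                 const char* file, int line) {
  std::ostringstream msg;
  msg << file << ":" << line << ": " << expr
      << " failed: " << StatusString(s) << "\n";
  return msg.str();
}

// Evaluates expr once. On failure, logs one line to stderr and returns
// false from the enclosing function, so later calls are not attempted.
#define CHECK_STATUS(expr)                                               \
  do {                                                                   \
    const Status status_ = (expr);                                       \
    if (status_ != Status::kSuccess) {                                   \
      std::cerr << FormatFailure(status_, #expr, __FILE__, __LINE__);    \
      return false;                                                      \
    }                                                                    \
  } while (0)

// Example caller: reports success only if the checked step succeeded.
inline bool RunStep(Status result) {
  CHECK_STATUS(result);
  return true;
}
```

With a macro shaped like this, the first failing call (for example an out-of-memory allocation) would be the only one reported, which may make the root cause easier to spot than a cascade of follow-on errors.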
