[feat CudaRansac] Parallelize final mask computation. Use shared variables instead of global variables #30

Merged
merged 3 commits into main from cuda-ransac/parallel-distance-calculation
Apr 6, 2024

Conversation

true-real-michael
Collaborator

@true-real-michael true-real-michael commented Apr 5, 2024

This PR speeds up RANSAC by (1) parallelising the result mask computation and (2) replacing global variables with variables in shared memory, which are shared between the threads of a block.

  1. Currently, the kernel's result (the result mask) is written by a single thread. The mask may be large, so this write can be slow. With this PR, the thread that found the best plane writes the plane to shared memory, and all threads of the block then compute the mask from the point-to-plane distances.
  2. Global CUDA memory is slower than shared memory, which is visible only to the threads of one block. The variables mask_mutex and max_inliers_number_cuda are therefore no longer allocated in global memory; they are replaced with variables in shared memory.
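
The two points above can be illustrated with a minimal sketch. It is not the code from this PR: the names `Plane`, `bestPlaneMaskSketch` and `runSketch` are hypothetical, it handles a single cloud in a single block, and it picks the winning thread with shared-memory atomics rather than the mutex-based scheme mentioned above. It assumes each thread has already been given one candidate plane with a unit-length normal.

```cuda
#include <cuda_runtime.h>
#include <vector>

// Hypothetical plane: a*x + b*y + c*z + d = 0, with (a, b, c) of unit length.
struct Plane { float a, b, c, d; };

__device__ float pointPlaneDistance(const Plane &p, const float3 &q) {
    return fabsf(p.a * q.x + p.b * q.y + p.c * q.z + p.d);
}

// One block, one candidate plane per thread (assumption for this sketch).
__global__ void bestPlaneMaskSketch(const float3 *points, int n_points,
                                    const Plane *candidates, float threshold,
                                    unsigned char *mask) {
    // Block-level state lives in shared memory instead of global memory.
    __shared__ int best_inliers;
    __shared__ int best_thread;
    __shared__ Plane best_plane;

    if (threadIdx.x == 0) {
        best_inliers = -1;
        best_thread = (int)blockDim.x;
    }
    __syncthreads();

    // Each thread counts the inliers of its own candidate.
    const Plane mine = candidates[threadIdx.x];
    int inliers = 0;
    for (int i = 0; i < n_points; ++i)
        inliers += (pointPlaneDistance(mine, points[i]) < threshold) ? 1 : 0;

    // Find the best count, then break ties by the lowest thread index.
    atomicMax(&best_inliers, inliers);
    __syncthreads();
    if (inliers == best_inliers) atomicMin(&best_thread, (int)threadIdx.x);
    __syncthreads();

    // Only the winning thread publishes its plane to shared memory.
    if ((int)threadIdx.x == best_thread) best_plane = mine;
    __syncthreads();

    // All threads write the mask together in a block-stride loop.
    for (int i = threadIdx.x; i < n_points; i += blockDim.x)
        mask[i] = (pointPlaneDistance(best_plane, points[i]) < threshold) ? 1 : 0;
}

// Host-side demo on toy data; CUDA error checking is omitted for brevity.
std::vector<unsigned char> runSketch() {
    std::vector<float3> pts = {
        make_float3(0, 0, 0), make_float3(1, 0, 0), make_float3(0, 1, 0),
        make_float3(1, 1, 0), make_float3(0, 0, 5), make_float3(1, 1, 5)};
    std::vector<Plane> planes = {
        {0, 0, 1, 0},     // z = 0: four inliers
        {0, 0, 1, -5},    // z = 5: two inliers
        {1, 0, 0, -100}}; // x = 100: no inliers
    const int n = (int)pts.size();

    float3 *d_pts; Plane *d_planes; unsigned char *d_mask;
    cudaMalloc(&d_pts, n * sizeof(float3));
    cudaMalloc(&d_planes, planes.size() * sizeof(Plane));
    cudaMalloc(&d_mask, n);
    cudaMemcpy(d_pts, pts.data(), n * sizeof(float3), cudaMemcpyHostToDevice);
    cudaMemcpy(d_planes, planes.data(), planes.size() * sizeof(Plane),
               cudaMemcpyHostToDevice);

    bestPlaneMaskSketch<<<1, (unsigned)planes.size()>>>(d_pts, n, d_planes,
                                                        0.1f, d_mask);
    cudaDeviceSynchronize();

    std::vector<unsigned char> mask(n);
    cudaMemcpy(mask.data(), d_mask, n, cudaMemcpyDeviceToHost);
    cudaFree(d_pts); cudaFree(d_planes); cudaFree(d_mask);
    return mask;
}
```

Calling `runSketch()` from a host `main` should mark the four points on z = 0 as inliers and the two points at z = 5 as outliers. The block-stride loop at the end is what spreads the mask write over all threads instead of leaving it to one.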

Processing time comparison:

Clouds  Current (s)    New (s)
10       1.205917     1.126165
20       2.098319     1.914480
30       2.969012     2.602244
50       4.742269     3.968544
80       7.056089     5.925835
100      9.202390     7.302907

@true-real-michael true-real-michael changed the title from "[feat CudaRansac] Parallelize final mask computation among the threads of a block" to "[feat CudaRansac] Parallelize final mask computation. Use shared variables instead of global variables" on Apr 5, 2024
@true-real-michael true-real-michael self-assigned this Apr 5, 2024
@true-real-michael true-real-michael merged commit 37d554a into main Apr 6, 2024
4 checks passed
@true-real-michael true-real-michael deleted the cuda-ransac/parallel-distance-calculation branch April 6, 2024 19:24
2 participants