Can't build with Nvidia HPC SDK #563

Open
Robadob opened this issue Jun 21, 2021 · 8 comments

@Robadob
Member

Robadob commented Jun 21, 2021

The HPC SDK packages the curand headers in a separate math libraries include directory. As of CMake 3.20 this directory is not added to the include path, so curand_kernel.h is not found at build time.

This also affects RTC compilation, as RTC only includes cuda/include (inside JitifyCache.cu).

@ptheywood
Member

ptheywood commented Jun 21, 2021

CMake 3.17+ ships with a new FindCUDAToolkit module, which provides CMake imported targets for CUDA sub-features such as nvrtc, and for the CUDA runtime itself (rather than only making the CUDA runtime available by requiring CUDA as a language).

This, however, does not currently work if nvhpc is the only source of nvcc/CUDA; it probably needs an issue opening against CMake?
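
For illustration only (this is not the project's actual CMake; the flamegpu target name is just an example here), consuming those imported targets looks something like:

find_package(CUDAToolkit REQUIRED)
# Linking the imported targets should also propagate the matching include
# directories (e.g. curand's headers, wherever the SDK layout places them).
target_link_libraries(flamegpu PRIVATE CUDA::cudart CUDA::nvrtc CUDA::curand)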

ptheywood added this to the v2.Y.Z milestone Aug 11, 2021
@ptheywood
Member

When using CMake 3.21 and nvhpc 21.7 with nvc++ as the host compiler, rapidjson cannot be compiled due to its use of uint128. This needs further investigation, as the relevant section of code was macro-guarded behind some gcc-specific-looking macros.

Unclear if this combination would work when using nvcc from the nvhpc install but with g++ as the host compiler (the above issue re: curand).

Additionally, #661 makes use of FindCUDAToolkit, which may also improve the situation (if FindCUDAToolkit is nvhpc compatible/aware).

@ptheywood
Member

#977 has been opened to split nvc++ host compiler support out from this issue, which is now about making sure that the nvcc installed via the nvhpc SDK is viable for building FLAME GPU 2.

Originally (during a hackathon) it would not build due to the curand issues mentioned in the top post.

Since we've updated to CMake >= 3.18 and use FindCUDAToolkit, hopefully this problem is resolved.

This might just require configuring (and building) FLAME GPU 2 with the nvcc from nvhpc on the path (i.e. module rm CUDA/11.x, module load nvhpc on mav/waimea), using different versions of CMake / nvhpc to find version(s) that work.
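
If relying on the module-provided PATH turns out to be fragile, a minimal sketch of pinning the compiler instead would be something like the following (the path is an example from an HPC SDK 22.7 install; in practice it would normally be passed as -DCMAKE_CUDA_COMPILER=... on the command line):

# Hypothetical cache/toolchain snippet: pin the HPC SDK's nvcc explicitly
# rather than relying on PATH ordering; must be set before CUDA is enabled.
set(CMAKE_CUDA_COMPILER "/opt/nvidia/hpc_sdk/Linux_x86_64/22.7/compilers/bin/nvcc" CACHE FILEPATH "CUDA compiler")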

@ptheywood
Member

ptheywood commented Nov 17, 2022

With:

  • CMake 3.22
  • NVCC 11.7 from nvhpc 22.7
  • GCC 11.3

via (on a box I've set up the module files for):

module rm CUDA
module load nvhpc
module load gcc/11

Produces the following configuration output:

cmake .. -DCUDA_ARCH=86 -DBUILD_TESTS=ON -DBUILD_SWIG_PYTHON=ON 
-- Looking for a CXX compiler
-- Looking for a CXX compiler - /home/ptheywood/bin/lmod-modules/symlinks/gcc/11/g++
-- The CXX compiler identification is GNU 11.3.0
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: /home/ptheywood/bin/lmod-modules/symlinks/gcc/11/g++ - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Looking for a CUDA compiler
-- Looking for a CUDA compiler - /opt/nvidia/hpc_sdk/Linux_x86_64/22.7/compilers/bin/nvcc
-- Looking for a CUDA host compiler - /home/ptheywood/bin/lmod-modules/symlinks/gcc/11/g++
-- The CUDA compiler identification is NVIDIA 11.7.64
-- Detecting CUDA compiler ABI info
-- Detecting CUDA compiler ABI info - done
-- Check for working CUDA compiler: /opt/nvidia/hpc_sdk/Linux_x86_64/22.7/compilers/bin/nvcc - skipped
-- Detecting CUDA compile features
-- Detecting CUDA compile features - done
-- Found Git: /usr/bin/git (found version "2.34.1") 
-- The C compiler identification is GNU 11.3.0
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working C compiler: /home/ptheywood/bin/lmod-modules/symlinks/gcc/11/gcc - skipped
-- Detecting C compile features
-- Detecting C compile features - done
-- -----Configuring Project: flamegpu-----
-- Found CUDAToolkit: /opt/nvidia/hpc_sdk/Linux_x86_64/22.7/cuda/11.7/include (found version "11.7.64") 
-- Looking for pthread.h
-- Looking for pthread.h - found
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Success
-- Found Threads: TRUE  
-- Found Thrust: /opt/nvidia/hpc_sdk/Linux_x86_64/22.7/cuda/11.7/lib64/cmake/thrust/thrust-config.cmake (found version "1.15.0.0") 
-- Found Jitify: /home/ptheywood/code/flamegpu/FLAMEGPU2/build-c3.22-nvhpcnvcc11.7/_deps/jitify-src (found version "0.9") 
-- RapidJSON found. Headers: /home/ptheywood/code/flamegpu/FLAMEGPU2/build-c3.22-nvhpcnvcc11.7/_deps/rapidjson-src/include
-- Setting build type to 'Release' as none was specified.
-- Found Doxygen: /usr/bin/doxygen (found version "1.9.1") found components: doxygen dia dot missing components: mscgen
-- flamegpu version 2.0.0-alpha.3+1cacd900
-- Generating Compute Capabilities: 86
-- -----Configuring Project: boids_bruteforce-----
-- -----Configuring Project: boids_spatial3D-----
-- -----Configuring Project: boids_rtc_bruteforce-----
-- -----Configuring Project: boids_rtc_spatial3D-----
-- -----Configuring Project: circles_bruteforce-----
-- -----Configuring Project: circles_spatial3D-----
-- -----Configuring Project: game_of_life-----
-- -----Configuring Project: host_functions-----
-- -----Configuring Project: ensemble-----
-- -----Configuring Project: sugarscape-----
-- -----Configuring Project: diffusion-----
-- Found Python: /usr/bin/python3.10 (found version "3.10.6") found components: Interpreter 
-- -----Configuring Project: tests-----
-- Found Python3: /usr/bin/python3.10 (found version "3.10.6") found components: Interpreter Development.Module 
-- Python found at /usr/bin/python3.10
-- Could NOT find SWIG (missing: SWIG_EXECUTABLE SWIG_DIR) (Required is at least version "4.0.2")
-- [swig] Downloading swig-4.0.2.tar.gz
-- [swig] ./configure --prefix /home/ptheywood/code/flamegpu/FLAMEGPU2/build-c3.22-nvhpcnvcc11.7/_deps/swig-build
-- [swig] make
-- [swig] make install
-- Found SWIG: /home/ptheywood/code/flamegpu/FLAMEGPU2/build-c3.22-nvhpcnvcc11.7/_deps/swig-build/bin/swig (found suitable version "4.0.2", minimum required is "4.0.2")  
-- -----Configuring Project: pyflamegpu-----
-- pyflamegpu-2.0.0a3 (2.0.0a3+cuda117)
-- Found python module: setuptools (version "59.6.0")
-- Found python module: wheel (version "0.37.1")
-- Found python module: build (version "0.8.0")
-- Found python module: venv (version "Unknown")
-- Configuring done
-- Generating done
-- Build files have been written to: /home/ptheywood/code/flamegpu/FLAMEGPU2/build-c3.22-nvhpcnvcc11.7

And it builds successfully; however, it emits many, many device warnings from the CUB headers.

/opt/nvidia/hpc_sdk/Linux_x86_64/22.7/cuda/11.7/include/cub/block/specializations/../../block/block_exchange.cuh(720): warning #1719-D: the initialization of member "cub::BlockExchange<InputT, BLOCK_DIM_X, ITEMS_PER_THREAD, WARP_TIME_SLICING, BLOCK_DIM_Y, BLOCK_DIM_Z, PTX_ARCH>::lane_id" will be done before that of member "cub::BlockExchange<InputT, BLOCK_DIM_X, ITEMS_PER_THREAD, WARP_TIME_SLICING, BLOCK_DIM_Y, BLOCK_DIM_Z, PTX_ARCH>::warp_id"

/opt/nvidia/hpc_sdk/Linux_x86_64/22.7/cuda/11.7/include/cub/device/dispatch/../../agent/agent_histogram.cuh(705): warning #1719-D: the initialization of member "cub::AgentHistogram<AgentHistogramPolicyT, PRIVATIZED_SMEM_BINS, NUM_CHANNELS, NUM_ACTIVE_CHANNELS, SampleIteratorT, CounterT, PrivatizedDecodeOpT, OutputDecodeOpT, OffsetT, PTX_ARCH>::output_decode_op" will be done before that of member "cub::AgentHistogram<AgentHistogramPolicyT, PRIVATIZED_SMEM_BINS, NUM_CHANNELS, NUM_ACTIVE_CHANNELS, SampleIteratorT, CounterT, PrivatizedDecodeOpT, OutputDecodeOpT, OffsetT, PTX_ARCH>::privatized_decode_op"

/opt/nvidia/hpc_sdk/Linux_x86_64/22.7/cuda/11.7/include/cub/device/dispatch/../../agent/agent_select_if.cuh(246): warning #1719-D: the initialization of member "cub::AgentSelectIf<AgentSelectIfPolicyT, InputIteratorT, FlagsInputIteratorT, SelectedOutputIteratorT, SelectOpT, EqualityOpT, OffsetT, KEEP_REJECTS>::d_selected_out" will be done before that of member "cub::AgentSelectIf<AgentSelectIfPolicyT, InputIteratorT, FlagsInputIteratorT, SelectedOutputIteratorT, SelectOpT, EqualityOpT, OffsetT, KEEP_REJECTS>::d_flags_in"

/opt/nvidia/hpc_sdk/Linux_x86_64/22.7/cuda/11.7/include/cub/device/dispatch/dispatch_radix_sort.cuh(1708): warning #1719-D: the initialization of member "cub::DispatchSegmentedRadixSort<IS_DESCENDING, KeyT, ValueT, BeginOffsetIteratorT, EndOffsetIteratorT, OffsetT, SelectedPolicy>::stream" will be done before that of member "cub::DispatchSegmentedRadixSort<IS_DESCENDING, KeyT, ValueT, BeginOffsetIteratorT, EndOffsetIteratorT, OffsetT, SelectedPolicy>::is_overwrite_okay"

(The same four warnings are repeated verbatim further through the build output.)

This is #define CUB_VERSION 101500, so not super old.

The compilation command for a single file is:

cd /home/ptheywood/code/flamegpu/FLAMEGPU2/build-c3.22-nvhpcnvcc11.7/tests && /opt/nvidia/hpc_sdk/Linux_x86_64/22.7/compilers/bin/nvcc -forward-unknown-to-host-compiler -ccbin=/home/ptheywood/bin/lmod-modules/symlinks/gcc/11/g++ -DCURAND_Philox4_32_10 -DJITIFY_PRINT_LOG -DMIN_CUDA_ARCH=86 -DSEATBELTS=1 -I/home/ptheywood/code/flamegpu/FLAMEGPU2/tests -I/home/ptheywood/code/flamegpu/FLAMEGPU2/include -isystem=/home/ptheywood/code/flamegpu/FLAMEGPU2/build-c3.22-nvhpcnvcc11.7/_deps/jitify-src -isystem=/home/ptheywood/code/flamegpu/FLAMEGPU2/build-c3.22-nvhpcnvcc11.7/_deps/googletest-src/googletest/include -isystem=/home/ptheywood/code/flamegpu/FLAMEGPU2/build-c3.22-nvhpcnvcc11.7/_deps/googletest-src/googletest -O3 -DNDEBUG -Xcompiler -Wall,-Wsign-compare --Wreorder -Xcudafe --display_error_number -Wno-deprecated-gpu-targets -Xcudafe --diag_suppress=declared_but_not_referenced -Xcudafe --diag_suppress=2809 -lineinfo --expt-relaxed-constexpr --threads 2 -gencode arch=compute_86,code=sm_86 -gencode arch=compute_86,code=compute_86 -std=c++17 -MD -MT tests/CMakeFiles/tests.dir/test_cases/runtime/messaging/test_brute_force.cu.o -MF CMakeFiles/tests.dir/test_cases/runtime/messaging/test_brute_force.cu.o.d -x cu -dc /home/ptheywood/code/flamegpu/FLAMEGPU2/tests/test_cases/runtime/messaging/test_brute_force.cu -o CMakeFiles/tests.dir/test_cases/runtime/messaging/test_brute_force.cu.o

This does not specify the found thrust/cub as -isystem, which might be why -Wreorder is triggering these warnings. Based on a comment about not being able to use -Wreorder on Windows, we might not be able to suppress this; I'm also not sure how easy it is to differentiate between nvcc from CUDA and nvcc from nvhpc via CMake.
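
For what it's worth, one (admittedly sketchy) way to attempt that differentiation would be to inspect the resolved compiler path; this is purely illustrative and would be brittle for non-default install prefixes:

# Illustrative only: guess whether nvcc is provided by an NVIDIA HPC SDK
# install by matching against the resolved compiler path.
if(CMAKE_CUDA_COMPILER MATCHES "hpc_sdk")
    message(STATUS "nvcc appears to be provided by an NVIDIA HPC SDK install")
endif()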

So it does work with recent enough versions; it just needs some CMake warning suppression tweaks.

It will be worth checking with older CMake (3.18) and probably older nvhpc too.

I do get test suite failure(s) for RTC, but it's mostly just from the nvhpc module file not setting CUDA_PATH, which we rely on to find nvcc.

unknown file: Failure
C++ exception with description "/home/ptheywood/code/flamegpu/FLAMEGPU2/src/flamegpu/util/detail/JitifyCache.cu(127): Error could not find CUDA include directory. Please specify using the CUDA_PATH environment variable" thrown in the test body.
module show nvhpc
--------------------------------------------------------------------------------------------------------------------------------------
   /opt/nvidia/hpc_sdk/modulefiles/nvhpc/22.7:
--------------------------------------------------------------------------------------------------------------------------------------
conflict("nvhpc")
conflict("nvhpc-nompi")
conflict("nvhpc-byo-compiler")
setenv("NVHPC","/opt/nvidia/hpc_sdk")
setenv("NVHPC_ROOT","/opt/nvidia/hpc_sdk/Linux_x86_64/22.7")
setenv("CC","/opt/nvidia/hpc_sdk/Linux_x86_64/22.7/compilers/bin/nvc")
setenv("CXX","/opt/nvidia/hpc_sdk/Linux_x86_64/22.7/compilers/bin/nvc++")
setenv("FC","/opt/nvidia/hpc_sdk/Linux_x86_64/22.7/compilers/bin/nvfortran")
setenv("F90","/opt/nvidia/hpc_sdk/Linux_x86_64/22.7/compilers/bin/nvfortran")
setenv("F77","/opt/nvidia/hpc_sdk/Linux_x86_64/22.7/compilers/bin/nvfortran")
setenv("CPP","cpp")
prepend_path("PATH","/opt/nvidia/hpc_sdk/Linux_x86_64/22.7/cuda/bin")
prepend_path("PATH","/opt/nvidia/hpc_sdk/Linux_x86_64/22.7/compilers/bin")
prepend_path("PATH","/opt/nvidia/hpc_sdk/Linux_x86_64/22.7/comm_libs/mpi/bin")
prepend_path("PATH","/opt/nvidia/hpc_sdk/Linux_x86_64/22.7/compilers/extras/qd/bin")
prepend_path("LD_LIBRARY_PATH","/opt/nvidia/hpc_sdk/Linux_x86_64/22.7/cuda/lib64")
prepend_path("LD_LIBRARY_PATH","/opt/nvidia/hpc_sdk/Linux_x86_64/22.7/cuda/extras/CUPTI/lib64")
prepend_path("LD_LIBRARY_PATH","/opt/nvidia/hpc_sdk/Linux_x86_64/22.7/compilers/extras/qd/lib")
prepend_path("LD_LIBRARY_PATH","/opt/nvidia/hpc_sdk/Linux_x86_64/22.7/compilers/lib")
prepend_path("LD_LIBRARY_PATH","/opt/nvidia/hpc_sdk/Linux_x86_64/22.7/math_libs/lib64")
prepend_path("LD_LIBRARY_PATH","/opt/nvidia/hpc_sdk/Linux_x86_64/22.7/comm_libs/mpi/lib")
prepend_path("LD_LIBRARY_PATH","/opt/nvidia/hpc_sdk/Linux_x86_64/22.7/comm_libs/nccl/lib")
prepend_path("LD_LIBRARY_PATH","/opt/nvidia/hpc_sdk/Linux_x86_64/22.7/comm_libs/nvshmem/lib")
prepend_path("CPATH","/opt/nvidia/hpc_sdk/Linux_x86_64/22.7/math_libs/include")
prepend_path("CPATH","/opt/nvidia/hpc_sdk/Linux_x86_64/22.7/comm_libs/mpi/include")
prepend_path("CPATH","/opt/nvidia/hpc_sdk/Linux_x86_64/22.7/comm_libs/nccl/include")
prepend_path("CPATH","/opt/nvidia/hpc_sdk/Linux_x86_64/22.7/comm_libs/nvshmem/include")
prepend_path("CPATH","/opt/nvidia/hpc_sdk/Linux_x86_64/22.7/compilers/extras/qd/include/qd")
prepend_path("MANPATH","/opt/nvidia/hpc_sdk/Linux_x86_64/22.7/compilers/man")
setenv("OPAL_PREFIX","/opt/nvidia/hpc_sdk/Linux_x86_64/22.7/comm_libs/mpi")

@ptheywood
Member

ptheywood commented Nov 17, 2022

-isystem=/usr/local/cuda-11.8/include -isystem=/usr/local/cuda-11.8/include/../include (or the equivalent for different CUDA versions) is what is present in build commands for regular nvcc-based builds, which implicitly suppresses the warnings.

This comes from src/CMakeLists.txt, through an edge case to detect CUB, with the additional ../include variant there to prevent CMake from auto-removing the duplicate include. This section of CMake is still being triggered at configure time, but the -isystem flags are not making it through to the build commands in this case.

--diag_suppress=1719 can globally suppress these warnings, but we do want them on our own code, so we still really need -isystem if we want -Wreorder, or a CMake way of detecting that the nvcc is from an nvhpc install (which feels sketchy).
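
A possible sketch of the -isystem route (assuming CUDAToolkit_INCLUDE_DIRS from FindCUDAToolkit covers the toolkit-bundled thrust/cub; the flamegpu target name is illustrative):

# Mark the toolkit-provided headers as SYSTEM so they are passed via -isystem
# and their -Wreorder diagnostics are not reported, while keeping -Wreorder
# enabled for our own code. Sketch only, not what the build currently does.
target_include_directories(flamegpu SYSTEM PRIVATE ${CUDAToolkit_INCLUDE_DIRS})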

@ptheywood
Member

CMake 3.18 doesn't configure successfully with the nvcc from the nvhpc 22.7 install:

CMake Error at /home/ptheywood/.venvs/cmake-318/lib/python3.10/site-packages/cmake/data/share/cmake-3.18/Modules/FindPackageHandleStandardArgs.cmake:165 (message):
  Could NOT find CUDAToolkit (missing: CUDAToolkit_INCLUDE_DIR CUDA_CUDART)
  (found version "11.7.64")

CMake 3.20 is OK with this, however (3.19 isn't available via pip, so it's a smidge more effort to test).

@ptheywood
Member

NVHPC 20.11, which ships with CUDA 11.1, configures and builds FLAME GPU 2 with CMake 3.20 as well. This is the second-oldest release, and the oldest readily installable on Ubuntu 22.04 via apt, so that's as old as I'll test.

I also didn't get the -Wreorder warnings, but that is because the included thrust/cub is too old, so a version fetched via CMake is used instead (which is correctly marked as -isystem).

I.e. the reorder warnings only occur when CUB >= 1.14 is distributed with the nvhpc install.

@ptheywood
Member

The root cause of these warnings has been fixed for a future CUB release by NVIDIA/cub#582, but that won't be useful to us immediately.

ptheywood removed this from the v2.Y.Z milestone Dec 13, 2022