Commit ff72c38
Naraendanolmoonenmfepex-rzrSnektron
authored
Develop stream 2024-07-03 (#431)
* windows ci
* use the exact rocprim partition function and parameters for correct temporary memory size
* Enabling upstream test suite by default
* Using rocprim::uninitialized_array in set_operations
* Use return value of hip_rocprim::synchronize to fix warning
* Extend check-copyright.sh with support of © symbols in copyright
* Updated changelog
* Remove outdated CUDA dependencies (cub and libcudacxx submodules)
  https://github.com/NVIDIA/cub and https://github.com/NVIDIA/libcudacxx have been archived and may not contain the recent changes required for the current implementation of CUDA backend.
* Fix invoke_result_t on C++14 and OMP (use std::result_of_t instead of invoke_result_t)
* Port changes from upstream CCCL/thrust 2.3.2
* Remove unused THRUST_ROCPRIM_NS_PREFIX/POSTFIX
  Also the explicit `namespace thrust {` fails in thrust/testing/cmake/check_source_files.cmake
* Remove THRUST_CPP_DIALECT >= 2014 from hip backend (c++14 is rocPRIM's min requirement)
* Fix check_source_files.cmake validation failures
  1. #include <thrust/detail/config.h> is required when macros like THRUST_NAMESPACE_BEGIN are used
  2. #include <thrust/detail/memory_wrapper.h> must be used instead of #include <memory>
* docs: Fix parsing and rendering of functions with __device__ and __host__ attribs
* docs: Fix incorrect group names and formatting errors
* docs: Fix complex and functor types
* Extend CI with tests from CCCL from CUDA and OMP backends
* docs: Add type traits and c++17 parts to docs, add missing comments
* docs: Improve structure with chapters
* test: fix invalid arguments being passed to number distributions in tests
* add par_det policy to use deterministic versions of algorithms
* ci(.gitlab-ci.yml): split multi-target cmake job into matrix job and add hardened assertions build flag to test build
* docs: update changelog
* ci(.gitlab-ci.yml): fix 'unary operator expected' error in ci script
* ci(.gitlab-ci.yml): replace 'ROCM_PATH' variable with 'env:HIP_PATH' as the former is unintuitive
* Resolve "Add ParallelSTL to rocthrust"
* Extracted cmake functionality to common CMake module
* Added CMake files for new benchmarks folder
* Added Google benchmark as dependency
* Removed unused timers from internal benchmarks
* Added benchmark utils
* Added new rocThrust benchmarks
* Added cmdparser
* Added benchmark CI stage
* Removed repetitions and fixed name_format argument
* Reduced minimum execution time
* Fixed host_t execution policy
* Modified run() so that input and output are passed by reference
* Fixed generation method to match CCCL's
* Reworked gen_key_segments
* Update tests from upstream cccl(2.3.2)
  - disabled some flaky tests: tracking issue #516
  - Unlike upstream cccl use PRId64 instead of lld
* chore: bump version

---------

Co-authored-by: Nol Moonen <nol@streamhpc.com>
Co-authored-by: Lőrinc Serfőző <lorinc@streamhpc.com>
Co-authored-by: Anton Gorenko <anton@streamhpc.com>
Co-authored-by: Robin Voetter <robin@streamhpc.com>
Co-authored-by: Nick Breed <nick@streamhpc.com>
Co-authored-by: Beatriz Navidad Vilches <beatriz@streamhpc.com>
Co-authored-by: Balint Soproni <balint@streamhpc.com>
1 parent 610da50 commit ff72c38
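
The commit message mentions fixing a 'unary operator expected' error in the CI script, and the `.gitlab-ci.yml` diff replaces `[ $CI_COMMIT_BRANCH == develop ]` with `[[ $CI_COMMIT_BRANCH == "develop" ]]`. The following is a minimal bash sketch of why that change plausibly removes the error; it is not taken from the repository. The function names `pick_branch_old` and `pick_branch_new` are hypothetical, and the `commit_branch` argument stands in for `$CI_COMMIT_BRANCH`, which is assumed here to be empty in some pipelines (for example, tag or merge request pipelines).

```shell
#!/usr/bin/env bash

# Old style (hypothetical reproduction): the unquoted expansion disappears
# when the variable is empty, so `[` receives only `== develop` and bash
# reports "unary operator expected" on stderr.
pick_branch_old() {
    local commit_branch="$1" branch_name="develop_stream"
    if [ $commit_branch == develop ] || [ $commit_branch == master ]; then
        branch_name=$commit_branch
    fi
    echo "$branch_name"
}

# New style: `[[ ... ]]` is a bash keyword and does not word-split its
# operands, so an empty variable is compared as an empty string.
pick_branch_new() {
    local commit_branch="$1" branch_name="develop_stream"
    if [[ $commit_branch == "develop" ]] || [[ $commit_branch == "master" ]]; then
        branch_name=$commit_branch
    fi
    echo "$branch_name"
}

pick_branch_new ""        # falls back to the default branch
pick_branch_new "master"  # uses the commit branch
```

Both variants select the same branch; the old one additionally writes an error to the job log when the variable is empty. Quoting the expansion inside `[ ... ]` would be another way to avoid the error.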

321 files changed: +13434 -1781 lines changed

.gitlab-ci.yml

Lines changed: 248 additions & 20 deletions
@@ -1,5 +1,5 @@
 # ########################################################################
-# Copyright 2019-2023 Advanced Micro Devices, Inc.
+# Copyright 2019-2024 Advanced Micro Devices, Inc.
 # ########################################################################

 include:
@@ -10,18 +10,23 @@ include:
       - /deps-cmake.yaml
       - /deps-docs.yaml
       - /deps-rocm.yaml
+      - /deps-windows.yaml
+      - /deps-nvcc.yaml
       - /gpus-rocm.yaml
+      - /gpus-nvcc.yaml
       - /rules.yaml

 stages:
   - lint
-  - build  # Tests if builds succeed (CMake)
-  - test   # Tests if unit tests are passing (CTest)
+  - build     # Tests if builds succeed (CMake)
+  - test      # Tests if unit tests are passing (CTest)
+  - benchmark # Runs the non-internal benchmarks (Google Benchmark)

 variables:
   # Helper variables
   PACKAGE_DIR: $BUILD_DIR/package
   ROCPRIM_GIT_BRANCH: develop_stream
+  ROCPRIM_DIR: ${CI_PROJECT_DIR}/rocPRIM

 copyright-date:
   extends:
@@ -56,44 +61,52 @@ copyright-date:
 .install-rocprim:
   script:
     - branch_name="$ROCPRIM_GIT_BRANCH"
-    - if [ $CI_COMMIT_BRANCH == develop ] || [ $CI_COMMIT_BRANCH == master ]; then branch_name=$CI_COMMIT_BRANCH;
+    - if [[ $CI_COMMIT_BRANCH == "develop" ]] || [[ $CI_COMMIT_BRANCH == "master" ]]; then branch_name=$CI_COMMIT_BRANCH;
     - fi;
-    - git clone -b $branch_name https://gitlab-ci-token:${CI_JOB_TOKEN}@${ROCPRIM_GIT_URL} $CI_PROJECT_DIR/rocPRIM
+    - git clone -b $branch_name https://gitlab-ci-token:${CI_JOB_TOKEN}@${ROCPRIM_GIT_URL} $ROCPRIM_DIR
     - cmake
       -G Ninja
       -D CMAKE_CXX_COMPILER=hipcc
       -D CMAKE_BUILD_TYPE=Release
       -D BUILD_TEST=OFF
       -D BUILD_EXAMPLE=OFF
       -D ROCM_DEP_ROCMCORE=OFF
-      -S $CI_PROJECT_DIR/rocPRIM
-      -B $CI_PROJECT_DIR/rocPRIM/build
-    - cd $CI_PROJECT_DIR/rocPRIM/build
+      -S $ROCPRIM_DIR
+      -B $ROCPRIM_DIR/build
+    - cd $ROCPRIM_DIR/build
     - cpack
       -G "DEB"
     - $SUDO_CMD dpkg -i rocprim*.deb

 .build:common:
   stage: build
+  tags:
+    - build
   extends:
     - .gpus:rocm-gpus
     - .rules:build
-  tags:
-    - build
+  variables:
+    EXTRA_CMAKE_CXX_FLAGS: ""
   script:
     - !reference [.install-rocprim, script]
-    # Setup env vars for testing
-    - rng_seed_count=0; prng_seeds="0";
-    - if [ $CI_COMMIT_BRANCH == develop_stream ] ; then rng_seed_count=3; prng_seeds="0, 1000";
-    - fi;
+    - | # Setup env vars for testing
+      rng_seed_count=0; prng_seeds="0";
+      if [[ $CI_COMMIT_BRANCH == "develop_stream" ]]; then
+        rng_seed_count=3
+        prng_seeds="0, 1000"
+      fi
+    - | # Add hardened libc++ assertions for tests only
+      if [[ $BUILD_TARGET == "TEST" ]]; then
+        echo "Configuring with hardened libc++!"
+        EXTRA_CMAKE_CXX_FLAGS+=" -D_GLIBCXX_ASSERTIONS=ON"
+      fi
     # Build rocThrust
     - cmake
       -G Ninja
       -D CMAKE_CXX_COMPILER=hipcc
-      -D CMAKE_BUILD_TYPE=Release
-      -D BUILD_TEST=ON
-      -D BUILD_EXAMPLES=ON
-      -D BUILD_BENCHMARKS=ON
+      -D CMAKE_CXX_FLAGS="$EXTRA_CMAKE_CXX_FLAGS"
+      -D CMAKE_BUILD_TYPE=$BUILD_TYPE
+      -D BUILD_$BUILD_TARGET=ON
       -D GPU_TARGETS=$GPU_TARGETS
       -D AMDGPU_TEST_TARGETS=$GPU_TARGETS
       -D RNG_SEED_COUNT=$rng_seed_count
@@ -103,6 +116,7 @@ copyright-date:
     - cmake --build $CI_PROJECT_DIR/build
   artifacts:
     paths:
+      - $CI_PROJECT_DIR/build/benchmarks/*
       - $CI_PROJECT_DIR/build/test/*
       - $CI_PROJECT_DIR/build/testing/*
       - $CI_PROJECT_DIR/build/deps/*
@@ -118,12 +132,20 @@ build:cmake-latest:
   extends:
     - .cmake-latest
     - .build:common
+  parallel:
+    matrix:
+      - BUILD_TYPE: Release
+        BUILD_TARGET: [BENCHMARKS, TEST, EXAMPLES]

 build:cmake-minimum:
   stage: build
   extends:
     - .cmake-minimum
     - .build:common
+  parallel:
+    matrix:
+      - BUILD_TYPE: Release
+        BUILD_TARGET: [BENCHMARKS, TEST, EXAMPLES]

 build:package:
   stage: build
@@ -149,6 +171,53 @@ build:package:
       - $PACKAGE_DIR/rocthrust*.zip
     expire_in: 2 weeks

+build:windows:
+  stage: build
+  needs: []
+  extends:
+    - .rules:build
+    - .gpus:rocm-windows
+    - .deps:rocm-windows
+    - .deps:visual-studio-devshell
+  script:
+    # Download, configure, and install rocPRIM
+    - $BRANCH_NAME=$ROCPRIM_GIT_BRANCH
+    - if ( $CI_COMMIT_BRANCH -eq "develop" -or $CI_COMMIT_BRANCH -eq "master" ) { $branch_name=$CI_COMMIT_BRANCH }
+    - git clone -b $BRANCH_NAME https://gitlab-ci-token:$CI_JOB_TOKEN@$ROCPRIM_GIT_URL $ROCPRIM_DIR
+    - \& cmake
+      -S "$ROCPRIM_DIR"
+      -B "$ROCPRIM_DIR/build"
+      -G Ninja
+      -D CMAKE_BUILD_TYPE=Release
+      -D GPU_TARGETS=$GPU_TARGET
+      -D BUILD_TEST=OFF
+      -D BUILD_EXAMPLE=OFF
+      -D BUILD_BENCHMARK=OFF
+      -D BUILD_SHARED_LIBS=$BUILD_SHARED_LIBS
+      -D CMAKE_CXX_COMPILER:FILEPATH="${env:HIP_PATH}/bin/clang++.exe"
+      -D CMAKE_INSTALL_PREFIX:PATH="$ROCPRIM_DIR/build/install" *>&1
+    - \& cmake --build "$ROCPRIM_DIR/build" --target install *>&1
+    # Configure and build rocThrust
+    - \& cmake
+      -S "$CI_PROJECT_DIR"
+      -B "$CI_PROJECT_DIR/build"
+      -G Ninja
+      -D CMAKE_BUILD_TYPE=Release
+      -D GPU_TARGETS=$GPU_TARGET
+      -D BUILD_TEST=ON
+      -D BUILD_EXAMPLES=OFF
+      -D BUILD_BENCHMARKS=OFF
+      -D CMAKE_CXX_FLAGS=-Wno-deprecated-declarations
+      -D CMAKE_CXX_COMPILER:FILEPATH="${env:HIP_PATH}/bin/clang++.exe"
+      -D CMAKE_INSTALL_PREFIX:PATH="$CI_PROJECT_DIR/build/install"
+      -D CMAKE_PREFIX_PATH:PATH="$ROCPRIM_DIR/build/install;${env:HIP_PATH}" *>&1
+    - \& cmake --build "$CI_PROJECT_DIR/build" *>&1
+  artifacts:
+    paths:
+      - $CI_PROJECT_DIR/build/
+      - $ROCPRIM_DIR/build/install
+    expire_in: 2 weeks
+
 test:package:
   stage: test
   needs:
@@ -157,7 +226,7 @@ test:package:
     - .cmake-minimum
     - .rules:test
   tags:
-    - build
+    - rocm
   script:
     - !reference [.install-rocprim, script]
     - $SUDO_CMD dpkg -i $PACKAGE_DIR/rocthrust*.deb
@@ -168,8 +237,11 @@ test:package:
       -G Ninja
       -D CMAKE_CXX_COMPILER=hipcc
       -D CMAKE_BUILD_TYPE=Release
+      -D GPU_TARGETS=$GPU_TARGETS
       -D ROCPRIM_ROOT=/opt/rocm/rocprim
     - cmake --build $CI_PROJECT_DIR/package_test
+    - cd $CI_PROJECT_DIR/package_test
+    - ctest --output-on-failure
     # Remove rocPRIM and rocThrust
     - $SUDO_CMD dpkg -r rocthrust-dev
     - $SUDO_CMD dpkg -r rocprim-dev
@@ -189,7 +261,11 @@ test:
     - .rules:test
     - .gpus:rocm
   needs:
-    - build:cmake-minimum
+    - job: build:cmake-minimum
+      parallel:
+        matrix:
+          - BUILD_TYPE: Release
+            BUILD_TARGET: TEST
   script:
     - cd $CI_PROJECT_DIR/build
     - cmake
@@ -205,3 +281,155 @@ test:
       --tests-regex $GPU_TARGET
       --resource-spec-file ./resources.json
       --parallel $PARALLEL_JOBS
+
+.rocm-windows:test:
+  extends:
+    - .gpus:rocm-windows
+    - .rules:test
+  stage: test
+  script:
+    - \& ctest --test-dir "$CI_PROJECT_DIR/build" --output-on-failure --no-tests=error *>&1
+
+test:rocm-windows:
+  extends:
+    - .rocm-windows:test
+  needs:
+    - build:windows
+
+.rocm-windows:test-install:
+  extends:
+    - .deps:rocm-windows
+    - .deps:visual-studio-devshell
+    - .gpus:rocm-windows
+    - .rules:test
+  stage: test
+  script:
+    - \& cmake --build "$CI_PROJECT_DIR/build" --target install *>&1
+    - \& cmake
+      -G Ninja
+      -S "$CI_PROJECT_DIR/extra"
+      -B "$CI_PROJECT_DIR/build/package_test"
+      -D CMAKE_BUILD_TYPE=Release
+      -D GPU_TARGETS=$GPU_TARGET
+      -D CMAKE_CXX_COMPILER:FILEPATH="${env:HIP_PATH}/bin/clang++.exe"
+      -D CMAKE_PREFIX_PATH:PATH="$ROCPRIM_DIR/build/install;${env:HIP_PATH}" *>&1
+    - \& cmake --build "$CI_PROJECT_DIR/build/package_test" *>&1
+    - \& ctest --test-dir "$CI_PROJECT_DIR/build/package_test" --output-on-failure --no-tests=error *>&1
+
+test:rocm-windows-install:
+  extends:
+    - .rocm-windows:test-install
+  needs:
+    - build:windows
+
+.nvcc:
+  extends:
+    - .deps:nvcc
+    - .gpus:nvcc-gpus
+    - .deps:cmake-latest
+    - .rules:manual
+  before_script:
+    - !reference [".deps:nvcc", before_script]
+    - !reference [".deps:cmake-latest", before_script]
+
+build:cuda-and-omp:
+  stage: build
+  extends:
+    - .nvcc
+    - .rules:build
+  tags:
+    - build
+  variables:
+    CCCL_GIT_BRANCH: v2.3.2
+    CCCL_DIR: ${CI_PROJECT_DIR}/cccl
+  needs: []
+  script:
+    - git clone -b $CCCL_GIT_BRANCH https://github.com/NVIDIA/cccl.git $CCCL_DIR
+    # Replace CCCL Thrust headers with rocThrust headers
+    - rm -R $CCCL_DIR/thrust/thrust
+    - cp -r $CI_PROJECT_DIR/thrust $CCCL_DIR/thrust
+    # Build tests and examples from CCCL Thrust
+    - cmake
+      -G Ninja
+      -D CMAKE_BUILD_TYPE=Release
+      -D CMAKE_CUDA_ARCHITECTURES="$GPU_TARGETS"
+      -D THRUST_ENABLE_TESTING=ON
+      -D THRUST_ENABLE_EXAMPLES=ON
+      -D THRUST_ENABLE_BENCHMARKS=OFF
+      -D THRUST_ENABLE_MULTICONFIG=ON
+      -D THRUST_MULTICONFIG_ENABLE_SYSTEM_OMP=ON
+      -D THRUST_MULTICONFIG_ENABLE_SYSTEM_CUDA=ON
+      -B $CI_PROJECT_DIR/build
+      -S $CCCL_DIR/thrust
+    - cmake --build $CI_PROJECT_DIR/build
+    - cd $CI_PROJECT_DIR/build
+    - ctest --output-on-failure --tests-regex "thrust.example.cmake.add_subdir|thrust.test.cmake.check_source_files"
+  artifacts:
+    paths:
+      - $CI_PROJECT_DIR/build/bin/
+      - $CI_PROJECT_DIR/build/CMakeCache.txt
+      - $CI_PROJECT_DIR/build/examples/cuda/CTestTestfile.cmake
+      - $CI_PROJECT_DIR/build/examples/CTestTestfile.cmake
+      - $CI_PROJECT_DIR/build/testing/unittest/CTestTestfile.cmake
+      - $CI_PROJECT_DIR/build/testing/async/CTestTestfile.cmake
+      - $CI_PROJECT_DIR/build/testing/omp/CTestTestfile.cmake
+      - $CI_PROJECT_DIR/build/testing/cuda/CTestTestfile.cmake
+      - $CI_PROJECT_DIR/build/testing/regression/CTestTestfile.cmake
+      - $CI_PROJECT_DIR/build/testing/cpp/CTestTestfile.cmake
+      - $CI_PROJECT_DIR/build/testing/CTestTestfile.cmake
+      - $CI_PROJECT_DIR/build/CTestTestfile.cmake
+      - $CCCL_DIR/thrust/cmake/ThrustRunTest.cmake
+      - $CCCL_DIR/thrust/cmake/ThrustRunExample.cmake
+      - $CI_PROJECT_DIR/build/.ninja_log
+    expire_in: 1 week
+
+test:cuda-and-omp:
+  stage: test
+  needs:
+    - build:cuda-and-omp
+  extends:
+    - .nvcc
+    - .gpus:nvcc
+    - .rules:test
+  before_script:
+    # This is only needed because of the legacy before_script in .gpus:nvcc would otherwise overwrite before_script
+    - !reference [.nvcc, before_script]
+  script:
+    - cd $CI_PROJECT_DIR/build
+    # These tests are executed on the build stage because they require sources
+    - ctest --output-on-failure --exclude-regex "thrust.example.cmake.add_subdir|thrust.test.cmake.check_source_files"
+
+.benchmark-base:
+  stage: benchmark
+  extends:
+    - .rules:benchmark
+  variables:
+    BENCHMARK_RESULT_DIR: ${CI_PROJECT_DIR}/benchmark_results
+    BENCHMARK_RESULT_CACHE_DIR: ${BENCHMARK_RESULT_DIR}_cache
+
+benchmark:
+  needs:
+    - build:cmake-minimum
+  extends:
+    - .cmake-minimum
+    - .gpus:rocm
+    - .benchmark-base
+  variables:
+    BENCHMARK_FILENAME_REGEX: ^benchmark
+    BENCHMARK_ALGORITHM_REGEX: ""
+  timeout: 3h
+  script:
+    - 'printf "CI Variables used in benchmarks:\nBENCHMARK_RESULT_DIR: %s\nBENCHMARK_FILENAME_REGEX: %s\nBENCHMARK_ALGORITHM_REGEX: %s \n" "$BENCHMARK_RESULT_DIR" "$BENCHMARK_FILENAME_REGEX" "$BENCHMARK_ALGORITHM_REGEX"'
+    - cd "${CI_PROJECT_DIR}"
+    - mkdir -p "${BENCHMARK_RESULT_DIR}"
+    - python3
+      .gitlab/run_benchmarks.py
+      --benchmark_dir "${CI_PROJECT_DIR}/build/benchmarks"
+      --benchmark_gpu_architecture "${GPU_TARGET}"
+      --benchmark_output_dir "${BENCHMARK_RESULT_DIR}"
+      --benchmark_filename_regex "${BENCHMARK_FILENAME_REGEX}"
+      --benchmark_filter_regex "${BENCHMARK_ALGORITHM_REGEX}"
+  artifacts:
+    paths:
+      - ${BENCHMARK_RESULT_DIR}
+    expire_in: 1 week
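
The `.build:common` changes in this diff replace the fixed `BUILD_TEST`/`BUILD_EXAMPLES`/`BUILD_BENCHMARKS` flags with values derived from the `parallel: matrix` variables `BUILD_TYPE` and `BUILD_TARGET`. As a rough illustration of how each matrix entry appears to map to CMake arguments, here is a small bash sketch. The helper `cmake_args_for` is hypothetical and only prints the arguments; it does not invoke CMake and is not part of the repository.

```shell
#!/usr/bin/env bash

# Hypothetical helper: prints the CMake arguments that one matrix job
# would plausibly pass, given its BUILD_TYPE and BUILD_TARGET values.
cmake_args_for() {
    local build_type="$1" build_target="$2"
    local extra_cxx_flags=""
    # Mirrors the diff: the assertions flag is appended for TEST builds only.
    if [[ $build_target == "TEST" ]]; then
        extra_cxx_flags+=" -D_GLIBCXX_ASSERTIONS=ON"
    fi
    printf '%s\n' \
        "-D CMAKE_CXX_FLAGS=\"$extra_cxx_flags\"" \
        "-D CMAKE_BUILD_TYPE=$build_type" \
        "-D BUILD_$build_target=ON"
}

# One job per BUILD_TARGET value listed in the matrix.
for target in BENCHMARKS TEST EXAMPLES; do
    echo "== $target =="
    cmake_args_for Release "$target"
done
```

Under these assumptions, each job enables exactly one of `BUILD_BENCHMARKS`, `BUILD_TEST`, or `BUILD_EXAMPLES`, which is consistent with the `test` job depending only on the `BUILD_TARGET: TEST` variant of `build:cmake-minimum`.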
