From d935e5546d4436285a934faf827409cb9247cfc5 Mon Sep 17 00:00:00 2001
From: Stanley Tsang
Date: Wed, 20 Nov 2024 16:44:57 -0700
Subject: [PATCH] Merge back 6.3 hotfixes (#490)

* Remove Thrust comments referencing website (#451)

Referencing or using code from some websites is prohibited in rocThrust.
Some comments with these kinds of references were recently added by
Thrust, and then when we updated the API, were brought into rocThrust.
This change removes the references in the comments.

* Specify minimum version for Google benchmark (#450)

* Remove Thrust comments referencing website (#447)

Referencing or using code from some websites is prohibited in rocThrust.
Some comments with these kinds of references were recently added by
Thrust, and then when we updated the API, were brought into rocThrust.
This change removes the references in the comments.

* Bump rocm-docs-core from 1.6.2 to 1.7.1 in /docs/sphinx (#448)

Bumps [rocm-docs-core](https://github.com/ROCm/rocm-docs-core) from 1.6.2 to 1.7.1.
- [Release notes](https://github.com/ROCm/rocm-docs-core/releases)
- [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md)
- [Commits](https://github.com/ROCm/rocm-docs-core/compare/v1.6.2...v1.7.1)

---
updated-dependencies:
- dependency-name: rocm-docs-core
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot]
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Specify minimum version for Google benchmark

Pass a minimum version to find_package to prevent it from using outdated
versions of Google benchmark that may be present on the system.

---------

Signed-off-by: dependabot[bot]
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Temporarily disable scan tests, re-enable tests on gfx11xx (#449)

* Remove Thrust comments referencing website (#447)

Referencing or using code from some websites is prohibited in rocThrust.
Some comments with these kinds of references were recently added by
Thrust, and then when we updated the API, were brought into rocThrust.
This change removes the references in the comments.

* Temporarily disable scan tests, re-enable tests on gfx11xx

Remove code that excludes gfx11xx tests on Jenkins, since they work
there now. Add a temporary exclusion for test_thrust_scan, which needs
a compiler fix.

* Remove website URL from comments (#456)

Referencing or using code from some websites is prohibited in this
repository. This change removes an informational reference in the
comments.

* Add checks around some platform-specific benchmark code (#455)

There were two spots in the new benchmark code that were causing
compile-time issues on some Windows systems. This change adds a check
to make sure we have 128-bit integer support before using int128_t in
generation_utils.hpp. It also avoids calling clock_gettime on Windows,
since it seems to be causing build issues there. Instead, I've restored
the old Windows timing code from PR #431, which uses
QueryPerformanceFrequency/Counter instead.

* Add gfx1151 build target (#457) (#459)

* Add gfx1151 target

* Revert "Add gfx1151 target"

This reverts commit 5889238d5e754f79fbb9d353b0bf4def528cfc78.

* Add gfx1151 target while preserving address sanitizer targets

---------

Co-authored-by: Stanley Tsang

* Remove website reference (#460)

Removed the link to more information from the CRC algorithm comments.

* updated the changelog for 6.3 (#480)

* updated the changelog for 6.3

* removed '(unreleased)' for 6.2

* added support for gfx12 and gfx1151 in default gpu list

* updated changelog

* fixed some minor grammar in changelog

* Remove gfx940,gfx941 targets (#484)

---------

Signed-off-by: dependabot[bot]
Co-authored-by: Wayne Franz
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: amd-garydeng
Co-authored-by: spolifroni-amd
Co-authored-by: NguyenNhuDi
Co-authored-by: Val Movsik <160653499+vamovsik@users.noreply.github.com>
---
 CHANGELOG.md      | 19 +++++++++----------
 CMakeLists.txt    |  4 ++--
 rmake.py          |  7 +++++--
 thrust/optional.h |  1 -
 4 files changed, 16 insertions(+), 15 deletions(-)

diff --git a/CHANGELOG.md b/CHANGELOG.md
index bb733e908..ce49cfafd 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -3,15 +3,10 @@
 Documentation for rocThrust available at
 [https://rocm.docs.amd.com/projects/rocThrust/en/latest/](https://rocm.docs.amd.com/projects/rocThrust/en/latest/).
 
-## (Unreleased) rocThrust 3.x.x for ROCm 6.x
-
-### Changes
-
-* Changed the C++ version from 14 to 17. C++14 will be deprecated in the next major release.
-
-## (Unreleased) rocThrust 3.3.0 for ROCm 6.4
+## rocThrust 3.3.0 for ROCm 6.4
 
 ### Added
+
 * Added extended tests to `rtest.py`. These tests are extra tests that did not fit the criteria of smoke and regression tests. These tests will take much longer to run relative to smoke and regression tests. Use `python rtest.py [--emulation|-e|--test|-t]=extended` to run these tests.
 * Added regression tests to `rtest.py`. These tests recreate scenarios that have caused hardware problems in past emulation environments. Use `python rtest.py [--emulation|-e|--test|-t]=regression` to run these tests.
 * Added smoke test options, which runs a subset of the unit tests and ensures that less than 2gb of VRAM will be used. Use `python rtest.py [--emulation|-e|--test|-t]=smoke` to run these tests.
@@ -24,11 +19,13 @@ Documentation for rocThrust available at
 * Updated HIPSTDPAR's `adjacent_find` to use rocPRIM's implementation
 
 ### Changed
+
+* Changed the C++ version from 14 to 17. C++14 will be deprecated in the next major release.
 * `--test|-t` is no longer a required flag for `rtest.py`. Instead, the user can use either `--emulation|-e` or `--test|-t`, but not both.
 * Split the contents of HIPSTDPAR's forwarding header into several implementation headers.
 * Fixed `copy_if` to work with large data types (512 bytes)
 
-## (Unreleased) rocThrust 3.2.0 for ROCm 6.3
+## rocThrust 3.2.0 for ROCm 6.3
 
 ### Added
 
@@ -36,17 +33,19 @@ Documentation for rocThrust available at
 * Only the NVIDIA backend uses `tuple` and `pair` types from libcu++, other backends continue to use the original Thrust implementations and hence do not require libcu++ (CCCL) as a dependency.
 * Added the `thrust::hip::par_det` execution policy to enable bitwise reproducibility on algorithms that are not bitwise reproducible by default.
-* Fix tests failing when compiling with `-D_GLIBCXX_ASSERTIONS=ON`.
 
 ### Changed
 
+* Updated the default value for the `-a` argument from `rmake.py` to `gfx906:xnack-,gfx1030,gfx1100,gfx1101,gfx1102,gfx1151,gfx1200,gfx1201`.
 * Enabled the upstream (thrust) test suite for execution by default. It can still be disabled by CMake option `-DENABLE_UPSTREAM_TESTS=OFF`.
 
 ### Resolved issues
 
+* Fixed an issue in `rmake.py` where the list storing cmake options would contain individual characters instead of a full string of options.
 * Fixed the HIP backend not passing `TestCopyIfNonTrivial` from the upstream (thrust) test suite.
+* Fixed tests failing when compiled with `-D_GLIBCXX_ASSERTIONS=ON`.
 
-## (Unreleased) rocThrust 3.1.0 for ROCm 6.2
+## rocThrust 3.1.0 for ROCm 6.2
 
 ### Additions
 
diff --git a/CMakeLists.txt b/CMakeLists.txt
index d50d4ccdd..802f5c63b 100644
--- a/CMakeLists.txt
+++ b/CMakeLists.txt
@@ -59,11 +59,11 @@ if(GPU_TARGETS STREQUAL "all")
   if(BUILD_ADDRESS_SANITIZER)
     # ASAN builds require xnack
     rocm_check_target_ids(DEFAULT_AMDGPU_TARGETS
-      TARGETS "gfx908:xnack+;gfx90a:xnack+;gfx940:xnack+;gfx941:xnack+;gfx942:xnack+"
+      TARGETS "gfx908:xnack+;gfx90a:xnack+;gfx942:xnack+"
     )
   else()
     rocm_check_target_ids(DEFAULT_AMDGPU_TARGETS
-      TARGETS "gfx803;gfx900:xnack-;gfx906:xnack-;gfx908:xnack-;gfx90a:xnack-;gfx90a:xnack+;gfx940;gfx941;gfx942;gfx1030;gfx1100;gfx1101;gfx1102;gfx1151;gfx1200;gfx1201"
+      TARGETS "gfx803;gfx900:xnack-;gfx906:xnack-;gfx908:xnack-;gfx90a:xnack-;gfx90a:xnack+;gfx942;gfx1030;gfx1100;gfx1101;gfx1102;gfx1151;gfx1200;gfx1201"
     )
   endif()
   set(GPU_TARGETS "${DEFAULT_AMDGPU_TARGETS}" CACHE STRING "GPU architectures to compile for" FORCE)
diff --git a/rmake.py b/rmake.py
index 3c304ccf7..2bb8834a1 100644
--- a/rmake.py
+++ b/rmake.py
@@ -20,6 +20,9 @@ def parse_args():
     parser = argparse.ArgumentParser(description="""
     Checks build arguments
     """)
+
+    default_gpus = 'gfx906:xnack-,gfx1030,gfx1100,gfx1101,gfx1102,gfx1151,gfx1200,gfx1201'
+
     parser.add_argument('-g', '--debug', required=False, default=False, action='store_true',
                         help='Generate Debug build (default: False)')
     parser.add_argument( '--build_dir', type=str, required=False, default="build",
@@ -35,7 +38,7 @@ def parse_args():
                         help='Install after build (default: False)')
     parser.add_argument( '--cmake-darg', required=False, dest='cmake_dargs', action='append', default=[],
                         help='List of additional cmake defines for builds (e.g. CMAKE_CXX_COMPILER_LAUNCHER=ccache)')
-    parser.add_argument('-a', '--architecture', dest='gpu_architecture', required=False, default="gfx906;gfx1030;gfx1100;gfx1101;gfx1102", #:sramecc+:xnack-" ) #gfx1030" ) #gfx906" ) # gfx1030" )
+    parser.add_argument('-a', '--architecture', dest='gpu_architecture', required=False, default=default_gpus, #:sramecc+:xnack-" ) #gfx1030" ) #gfx906" ) # gfx1030" )
                         help='Set GPU architectures, e.g. all, gfx000, gfx803, gfx906:xnack-;gfx1030 (optional, default: all)')
     parser.add_argument('-v', '--verbose', required=False, default=False, action='store_true',
                         help='Verbose build (default: False)')
@@ -108,7 +111,7 @@ def config_cmd():
     else:
         cmake_executable = "cmake"
         toolchain = "toolchain-linux.cmake"
-        cmake_platform_opts = f"-DROCM_DIR:PATH={rocm_path} -DCPACK_PACKAGING_INSTALL_PREFIX={rocm_path}"
+        cmake_platform_opts = [f"-DROCM_DIR:PATH={rocm_path}", f"-DCPACK_PACKAGING_INSTALL_PREFIX={rocm_path}"]
         tools = f"-DCMAKE_TOOLCHAIN_FILE={toolchain}"
         cmake_options.append( tools )
 
diff --git a/thrust/optional.h b/thrust/optional.h
index 1e19dcd16..707eba916 100644
--- a/thrust/optional.h
+++ b/thrust/optional.h
@@ -227,7 +227,6 @@ template struct is_const_or_const_ref : std::true_type{};
 #endif
 
 // std::invoke from C++17
-// https://stackoverflow.com/questions/38288042/c11-14-invoke-workaround
 THRUST_EXEC_CHECK_DISABLE
 template