Add GPU backends to distributed CI pipeline #1012
base: main
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -1,27 +1,124 @@ | ||
| FROM ubuntu:25.04 | ||
| FROM ubuntu:25.10 | ||
|
|
||
| ENV LANG C.UTF-8 | ||
| ENV LC_ALL C.UTF-8 | ||
|
|
||
| ARG DEBIAN_FRONTEND=noninteractive | ||
| RUN apt-get update -qq && apt-get install -qq -y --no-install-recommends \ | ||
| strace \ | ||
| build-essential \ | ||
| tar \ | ||
| wget \ | ||
| curl \ | ||
| libboost-dev \ | ||
| libnuma-dev \ | ||
| libopenmpi-dev \ | ||
| ca-certificates \ | ||
| libssl-dev \ | ||
| autoconf \ | ||
| automake \ | ||
| libtool \ | ||
| pkg-config \ | ||
| libreadline-dev \ | ||
| git && \ | ||
| RUN apt-get update && \ | ||
| apt-get install -y --no-install-recommends \ | ||
| autoconf \ | ||
| automake \ | ||
| build-essential \ | ||
| ca-certificates \ | ||
| curl \ | ||
| git \ | ||
| libboost-dev \ | ||
| libconfig-dev \ | ||
| libcurl4-openssl-dev \ | ||
| libfuse-dev \ | ||
| libjson-c-dev \ | ||
| libnl-3-dev \ | ||
| libnuma-dev \ | ||
| libreadline-dev \ | ||
| libsensors-dev \ | ||
| libssl-dev \ | ||
| libtool \ | ||
| libuv1-dev \ | ||
| libyaml-dev \ | ||
| nvidia-cuda-dev \ | ||
| nvidia-cuda-toolkit \ | ||
| nvidia-cuda-toolkit-gcc \ | ||
| pkg-config \ | ||
| python3 \ | ||
| strace \ | ||
| tar \ | ||
| wget && \ | ||
| rm -rf /var/lib/apt/lists/* | ||
|
|
||
| ENV CC=/usr/bin/cuda-gcc | ||
| ENV CXX=/usr/bin/cuda-g++ | ||
| ENV CUDAHOSTCXX=/usr/bin/cuda-g++ | ||
|
|
||
| # Install OpenMPI configured with libfabric, libcxi, and gdrcopy support for use | ||
| # on Alps. This is based on examples in | ||
| # https://github.com/eth-cscs/cray-network-stack. | ||
| ARG gdrcopy_version=2.5.1 | ||
| RUN set -eux; \ | ||
| git clone --depth 1 --branch "v${gdrcopy_version}" https://github.com/NVIDIA/gdrcopy.git; \ | ||
| cd gdrcopy; \ | ||
| make lib -j"$(nproc)" lib_install; \ | ||
| cd /; \ | ||
| rm -rf /gdrcopy; \ | ||
| ldconfig | ||
|
|
||
| ARG cassini_headers_version=release/shs-13.0.0 | ||
| RUN set -eux; \ | ||
| git clone --depth 1 --branch "${cassini_headers_version}" https://github.com/HewlettPackard/shs-cassini-headers.git; \ | ||
| cd shs-cassini-headers; \ | ||
| cp -r include/* /usr/include/; \ | ||
| cp -r share/* /usr/share/; \ | ||
| rm -rf /shs-cassini-headers | ||
|
|
||
| ARG cxi_driver_version=release/shs-13.0.0 | ||
| RUN set -eux; \ | ||
| git clone --depth 1 --branch "${cxi_driver_version}" https://github.com/HewlettPackard/shs-cxi-driver.git; \ | ||
| cd shs-cxi-driver; \ | ||
| cp -r include/* /usr/include/; \ | ||
| rm -rf /shs-cxi-driver | ||
|
|
||
| ARG libcxi_version=release/shs-13.0.0 | ||
| RUN set -eux; \ | ||
| git clone --depth 1 --branch "${libcxi_version}" https://github.com/HewlettPackard/shs-libcxi.git; \ | ||
| cd shs-libcxi; \ | ||
| ./autogen.sh; \ | ||
| ./configure \ | ||
| --with-cuda; \ | ||
| make -j"$(nproc)" install; \ | ||
| cd /; \ | ||
| rm -rf /shs-libcxi; \ | ||
| ldconfig | ||
|
|
||
| ARG xpmem_version=0d0bad4e1d07b38d53ecc8f20786bb1328c446da | ||
| RUN set -eux; \ | ||
| git clone https://github.com/hpc/xpmem.git; \ | ||
| cd xpmem; \ | ||
| git checkout "${xpmem_version}"; \ | ||
| ./autogen.sh; \ | ||
| ./configure --disable-kernel-module; \ | ||
| make -j"$(nproc)" install; \ | ||
| cd /; \ | ||
| rm -rf /xpmem; \ | ||
| ldconfig | ||
|
|
||
| # NOTE: xpmem is not found correctly without setting the prefix explicitly in | ||
| # --enable-xpmem | ||
| ARG libfabric_version=v2.4.0 | ||
| RUN set -eux; \ | ||
| git clone --depth 1 --branch "${libfabric_version}" https://github.com/ofiwg/libfabric.git; \ | ||
| cd libfabric; \ | ||
| ./autogen.sh; \ | ||
| ./configure \ | ||
| --with-cuda \ | ||
| --enable-xpmem=/usr \ | ||
| --enable-tcp \ | ||
| --enable-cxi; \ | ||
| make -j"$(nproc)" install; \ | ||
| cd /; \ | ||
| rm -rf /libfabric; \ | ||
| ldconfig | ||
|
|
||
| ARG openmpi_version=5.0.9 | ||
| RUN set -eux; \ | ||
| curl -fsSL "https://download.open-mpi.org/release/open-mpi/v5.0/openmpi-${openmpi_version}.tar.gz" -o /tmp/ompi.tar.gz; \ | ||
| tar -C /tmp -xzf /tmp/ompi.tar.gz; \ | ||
| cd "/tmp/openmpi-${openmpi_version}"; \ | ||
| ./configure \ | ||
| --with-ofi \ | ||
| --with-cuda=/usr; \ | ||
| make -j"$(nproc)" install; \ | ||
| cd /; \ | ||
| rm -rf "/tmp/openmpi-${openmpi_version}" /tmp/ompi.tar.gz; \ | ||
| ldconfig | ||
|
|
||
| # Install uv: https://docs.astral.sh/uv/guides/integration/docker | ||
| COPY --from=ghcr.io/astral-sh/uv:0.9.24@sha256:816fdce3387ed2142e37d2e56e1b1b97ccc1ea87731ba199dc8a25c04e4997c5 /uv /uvx /bin/ |
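The build steps above layer gdrcopy, libcxi, xpmem, libfabric, and Open MPI on top of each other, so a quick check inside the finished image can catch a silently disabled feature. The following is a minimal sketch, not part of the PR: the helper names (`has_provider`, `built_with_cuda`, `check_stack`) are hypothetical, and it assumes that `fi_info -l` prints each provider as `<name>:` at the start of a line and that `ompi_info --parsable --all` reports `mpi_built_with_cuda_support:value:true` for a CUDA-aware build. Whether the `cxi` provider is listed on a machine without Cassini hardware has not been verified here.

```shell
#!/usr/bin/env bash
# Sketch: sanity-check the MPI stack inside the built image.
# All function names are hypothetical helpers, not part of the PR.

# Succeeds if the provider list on stdin contains the named provider.
# Assumes the `fi_info -l` format: "<name>:" at the start of a line.
has_provider() {
    grep -q "^$1:"
}

# Succeeds if parsable ompi_info output on stdin reports CUDA support.
built_with_cuda() {
    grep -q 'mpi_built_with_cuda_support:value:true'
}

# Runs both checks against the installed tools, or skips if they are absent.
check_stack() {
    if ! command -v fi_info >/dev/null 2>&1 || ! command -v ompi_info >/dev/null 2>&1; then
        echo "skipped: fi_info or ompi_info not on PATH"
        return 0
    fi
    fi_info -l | has_provider cxi || { echo "libfabric: cxi provider missing"; return 1; }
    ompi_info --parsable --all | built_with_cuda || { echo "Open MPI: no CUDA support"; return 1; }
    echo "ok"
}
```

Calling `check_stack` in a throwaway container (or as a final `RUN` step) would fail early if, for example, `--with-cuda` did not take effect during the Open MPI configure step.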
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -361,7 +361,7 @@ url = 'https://gridtools.github.io/pypi/' | |
|
|
||
| [tool.uv.sources] | ||
| dace = {index = "gridtools"} | ||
| - ghex = {git = "https://github.com/msimberg/GHEX.git", branch = "async-mpi"} | ||
| + ghex = {git = "https://github.com/philip-paul-mueller/GHEX.git", branch = "phimuell__async-mpi-2"} | ||
|
Contributor
Author
This is updated because ghex-org/GHEX#190 contains a bugfix to how strides are computed for GPU buffers. Tests fail with
||
| # gt4py = {git = "https://github.com/GridTools/gt4py", branch = "main"} | ||
| # gt4py = {index = "test.pypi"} | ||
| icon4py-atmosphere-advection = {workspace = true} | ||
|
|
||
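Because the `ghex` source in this diff tracks a branch on a fork, the commit it resolves to can move over time. As a possible follow-up, the branch head could be resolved to a commit hash and recorded with uv's `rev` key instead of `branch`. The sketch below is an illustration only: `first_sha` and `resolve_branch` are hypothetical helper names, and it assumes the usual `git ls-remote` output of `<sha><TAB><ref>` per line.

```shell
#!/usr/bin/env bash
# Sketch: resolve a remote branch to the commit it currently points at.
# Helper names are hypothetical and not part of the PR.

# Prints the first field (the commit hash) of the first line on stdin.
first_sha() {
    awk 'NR == 1 { print $1 }'
}

# Usage: resolve_branch <repo-url> <branch>
# Requires network access; prints nothing if the branch does not exist.
resolve_branch() {
    git ls-remote "$1" "refs/heads/$2" | first_sha
}

# Example (not executed here):
#   resolve_branch https://github.com/philip-paul-mueller/GHEX.git phimuell__async-mpi-2
```

The printed hash could then replace the `branch = ...` entry with `rev = "<hash>"` in `[tool.uv.sources]`, which keeps the dependency fixed until the upstream fix is merged.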