Add CUTLASS v4.3.5 C++ headers to Modal runner image (#441)
Conversation
Install CUTLASS C++ headers to /opt/cutlass so users can #include <cutlass/...> and #include <cute/...> in their submissions. Also adds a test script to validate the setup before deploying, and documents how to add new C++ deps.
Pull request overview
Adds NVIDIA CUTLASS v4.3.5 header-only dependency to the Modal CUDA runner image and introduces a Modal test script to validate that CUDA compilation and PyTorch inline extensions can include CUTLASS/CuTe headers.
Changes:
- Install CUTLASS headers into `/opt/cutlass` in the production runner image and set related env vars.
- Add a Modal-based validation script that compiles a small nvcc program and a PyTorch `load_inline` extension.
- Document a workflow for adding future C++ header dependencies to the runner image.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 7 comments.
| File | Description |
|---|---|
| src/runners/test_cutlass_image.py | Adds a Modal script that builds a test CUDA image, installs CUTLASS headers, and validates include/compile behavior. |
| src/runners/modal_runner.py | Installs CUTLASS v4.3.5 headers in the production image and documents how to add more C++ deps. |
```python
    )
    # CUTLASS C++ headers for #include <cutlass/...>
    .run_commands(
        "git clone --depth 1 --branch v4.3.5 https://github.com/NVIDIA/cutlass.git /opt/cutlass",
```
Cloning CUTLASS by tag at build time is a supply-chain risk because tags can be moved/retagged upstream. Consider pinning to an exact commit SHA (and ideally verifying it), e.g., by fetching the repo and checking out a known commit, to make the image build reproducible and tamper-resistant.
```diff
- "git clone --depth 1 --branch v4.3.5 https://github.com/NVIDIA/cutlass.git /opt/cutlass",
+ # Pin CUTLASS to an exact commit SHA for reproducible, tamper-resistant builds.
+ # This SHA corresponds to the v4.3.5 release tag.
+ "git init /opt/cutlass && "
+ "cd /opt/cutlass && "
+ "git remote add origin https://github.com/NVIDIA/cutlass.git && "
+ "git fetch --depth 1 origin 9f2b20cdb57ee97dc0b1819b7358126d1c44cc97 && "
+ "git checkout --detach FETCH_HEAD",
```
`src/runners/test_cutlass_image.py` (outdated)

```python
    .run_commands(
        "git clone --depth 1 --branch v4.3.5 https://github.com/NVIDIA/cutlass.git /opt/cutlass",
    )
```
Same supply-chain concern as production: cloning by tag is not fully reproducible and can be retagged upstream. Pin to an immutable commit SHA (and/or validate the expected commit) so the pre-deploy validation script tests the exact dependency revision intended for production.
```python
    .env({
        "CUTLASS_PATH": "/opt/cutlass",
        "CPLUS_INCLUDE_PATH": "/opt/cutlass/include:/opt/cutlass/tools/util/include",
    })
```
Setting CPLUS_INCLUDE_PATH to a fixed value replaces any existing include paths that may already be configured in the base image (or by future dependencies). Prefer prepending/appending to the existing value (while handling the empty/unset case) to avoid breaking other C++ builds that rely on CPLUS_INCLUDE_PATH.
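A merge-preserving approach can be sketched at image-definition time. The helper below is illustrative, not part of this PR: it computes the value before it is passed to `.env()`, so any paths already present survive, and no stray `:` is emitted when the variable is empty or unset.

```python
import os

def merged_include_path(new_dirs, existing=None):
    """Join new include dirs with any pre-existing CPLUS_INCLUDE_PATH,
    avoiding a trailing ':' when the variable is empty or unset."""
    if existing is None:
        existing = os.environ.get("CPLUS_INCLUDE_PATH", "")
    parts = list(new_dirs) + ([existing] if existing else [])
    return ":".join(parts)

# CUTLASS dirs come first so they take precedence over any stale copies.
cutlass_dirs = ["/opt/cutlass/include", "/opt/cutlass/tools/util/include"]
print(merged_include_path(cutlass_dirs, existing="/usr/local/include"))
# → /opt/cutlass/include:/opt/cutlass/tools/util/include:/usr/local/include
```

One caveat: this evaluates the environment where the image is defined, not inside the container. Deferring expansion to a shell inside the image (e.g. via `${CPLUS_INCLUDE_PATH:-}`) is the alternative when the base image's value is only known at container run time.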
`src/runners/test_cutlass_image.py` (outdated)

```python
    )
    .env({
        "CUTLASS_PATH": "/opt/cutlass",
        "CPLUS_INCLUDE_PATH": "/opt/cutlass/include:/opt/cutlass/tools/util/include",
```
The test image also overwrites CPLUS_INCLUDE_PATH. To make this script robust (and to mirror best practice for the production image), consider prepending/appending to any existing value so the test doesn't accidentally mask include-path changes introduced by other packages.
```diff
- "CPLUS_INCLUDE_PATH": "/opt/cutlass/include:/opt/cutlass/tools/util/include",
+ "CPLUS_INCLUDE_PATH": "/opt/cutlass/include:/opt/cutlass/tools/util/include:${CPLUS_INCLUDE_PATH:-}",
```
`src/runners/test_cutlass_image.py` (outdated)

```python
app = modal.App("test-cutlass-image")

cuda_version = "13.1.0"
```
The base image is CUDA 13.1 (nvidia/cuda:13.1.0-...), but PyTorch below is installed from the cu130 index (CUDA 13.0). This mismatch can cause runtime/library or extension build/link issues (especially for load_inline). Align the CUDA toolkit version in the base image with the PyTorch wheel CUDA version (or vice versa) to ensure the test is validating the same CUDA stack users will run.
```diff
- cuda_version = "13.1.0"
+ cuda_version = "13.0.0"
```
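The tag correspondence can be checked mechanically. A small sketch (the helper name is hypothetical) mapping a CUDA base-image version to the PyTorch wheel index tag convention, where `cuXYZ` encodes major and minor version (e.g. `cu130` is CUDA 13.0):

```python
def wheel_tag_for(image_cuda_version: str) -> str:
    """Map a CUDA base-image version like '13.1.0' to the PyTorch wheel
    index tag convention, e.g. '13.0.0' -> 'cu130', '12.1.1' -> 'cu121'."""
    major, minor = image_cuda_version.split(".")[:2]
    return f"cu{major}{minor}"

# The mismatch flagged above: a 13.1 base image vs. the cu130 wheel index.
assert wheel_tag_for("13.1.0") != "cu130"
assert wheel_tag_for("13.0.0") == "cu130"
```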
`src/runners/test_cutlass_image.py` (outdated)

```python
    .uv_pip_install(
        "torch==2.9.1",
        index_url="https://download.pytorch.org/whl/cu130",
    )
```
This uses the cu130 wheel index while the image is CUDA 13.1 (see above). If the intention is to validate CUTLASS on the same CUDA version as the toolchain in the image, update either the base image CUDA tag or the PyTorch wheel index/version so they match.
`src/runners/test_cutlass_image.py` (outdated)

```python
compile_cmd = [
    "nvcc",
    cu_file,
    "-o", binary,
    "-I", f"{cutlass_path}/include",
    "-I", f"{cutlass_path}/tools/util/include",
    "-std=c++17",
    "-arch=sm_75",
]
```
The script comment + image env setup suggests CPLUS_INCLUDE_PATH should make CUTLASS headers discoverable without -I flags, but the nvcc compilation test always passes explicit -I include paths. If you want this test to validate the env-based include behavior, add a compilation attempt that omits -I (or make that the primary path), so failures in CPLUS_INCLUDE_PATH wiring are caught.
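One way to address this, sketched as a hypothetical helper (function and variable names are illustrative, not from the PR, and this assumes the host compiler invoked by nvcc honors CPLUS_INCLUDE_PATH): build both an env-only command and an explicit `-I` command, and run the env-only variant first so a broken CPLUS_INCLUDE_PATH fails the test rather than being papered over.

```python
def nvcc_commands(cu_file: str, binary: str, cutlass_path: str):
    """Return (env_only, explicit) nvcc invocations. The env-only command
    relies on CPLUS_INCLUDE_PATH baked into the image; the explicit one
    passes -I flags and serves as a fallback diagnostic."""
    env_only = ["nvcc", cu_file, "-o", binary, "-std=c++17", "-arch=sm_75"]
    explicit = env_only + [
        "-I", f"{cutlass_path}/include",
        "-I", f"{cutlass_path}/tools/util/include",
    ]
    return env_only, explicit

env_only, explicit = nvcc_commands("test.cu", "test_bin", "/opt/cutlass")
assert "-I" not in env_only  # this variant exercises CPLUS_INCLUDE_PATH wiring
```

If the env-only compile fails while the explicit one succeeds, the failure points directly at the env var wiring rather than at CUTLASS itself.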