
Automate SBOM generation for all CI images#309

Open
jayavenkatesh19 wants to merge 9 commits intorapidsai:mainfrom
jayavenkatesh19:sbom-generation

Conversation

@jayavenkatesh19
Contributor

@jayavenkatesh19 jayavenkatesh19 commented Oct 2, 2025

Towards https://github.com/rapidsai/build-infra/issues/280

Current Approach

PR builds

  • Image built and published to rapidsai/staging on Docker Hub
  • Image tag is prefixed with the PR number gathered from GITHUB_REF

Branch push

  • Image built and published to rapidsai/<image_repo> on Docker Hub
  • Image tag gathered from the compute matrix is used.
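
As a rough illustration of the PR-build tagging described above, here is a minimal POSIX sh sketch. The function names, the two accepted ref layouts (refs/pull/<N>/merge and refs/heads/pull-request/<N>), and the "<PR number>-<tag>" layout are assumptions for illustration only; the actual workflow may derive and format the tag differently.

```shell
# Hypothetical helper: extract the PR number from a GITHUB_REF value.
# Assumes refs shaped like refs/pull/<N>/merge or refs/heads/pull-request/<N>.
pr_number_from_ref() {
  case "$1" in
    refs/pull/*/merge)
      _n="${1#refs/pull/}"
      _n="${_n%/merge}"
      ;;
    refs/heads/pull-request/*)
      _n="${1#refs/heads/pull-request/}"
      ;;
    *)
      return 1
      ;;
  esac
  # Reject anything that is not purely numeric.
  case "$_n" in
    ''|*[!0-9]*) return 1 ;;
  esac
  printf '%s\n' "$_n"
}

# Hypothetical tag layout: "<PR number>-<tag from the compute matrix>".
staging_tag() {
  _pr="$(pr_number_from_ref "$1")" || return 1
  printf '%s-%s\n' "$_pr" "$2"
}
```

For example, staging_tag "$GITHUB_REF" "$IMAGE_TAG" would be called once per matrix entry, where IMAGE_TAG is a placeholder name for the tag coming out of the compute matrix.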

Proposed changes using the multi-stage build approach

  • Add a new stage in each Dockerfile called syft-base with the Syft binary installed on a minimal alpine 3.20 image.
  • The main docker build is done using a stage called <ci-img>-base to differentiate it from the final image.
  • Another stage called <ci-img>-sbom is added, in which the built stage is mounted at a specified location on the syft-base stage.
  • syft scan is run against the mounted location, and an SBOM is generated.
  • The generated SBOM is then copied to the final stage, with image name and tags kept unchanged to ensure no changes to how these images are built and published.

@jayavenkatesh19 jayavenkatesh19 self-assigned this Oct 2, 2025
@jayavenkatesh19 jayavenkatesh19 marked this pull request as ready for review October 20, 2025 23:53
@jayavenkatesh19 jayavenkatesh19 requested a review from a team as a code owner October 20, 2025 23:53
@jayavenkatesh19 jayavenkatesh19 requested review from msarahan and removed request for a team October 20, 2025 23:53
@jayavenkatesh19 jayavenkatesh19 changed the title [WIP] Automate SBOM generation for all CI images Automate SBOM generation for all CI images Oct 28, 2025
@jameslamb jameslamb requested review from jameslamb and removed request for msarahan January 23, 2026 18:16
Member

@jameslamb jameslamb left a comment


Doing this in a multi-stage build makes sense to me!

I left some suggestions on standardizing things and making the configuration flow a little stricter.

ARG BUILDPLATFORM
ARG SYFT_VER

RUN apk add --no-cache curl tar ca-certificates \
Member


Since this is repeated in multiple places, would you consider moving it into a script that's mounted in at build time?

Like this: rapidsai/docker#840
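
A minimal sketch of what such a shared install script could look like, written in POSIX sh so it runs under Alpine's default shell. The script and function names are hypothetical, and the release archive naming (syft_<version>_<os>_<arch>.tar.gz under the anchore/syft GitHub releases) is an assumption that should be checked against the actual release assets before use.

```shell
# install-syft.sh (hypothetical name)

# Map a Docker platform string such as "linux/amd64" to the "<os>_<arch>"
# suffix assumed for Syft release archives.
syft_asset_suffix() {
  case "$1" in
    linux/amd64) printf '%s\n' "linux_amd64" ;;
    linux/arm64*) printf '%s\n' "linux_arm64" ;;
    *) echo "unsupported platform: $1" >&2; return 1 ;;
  esac
}

# Build the download URL for a given Syft version and build platform.
syft_download_url() {
  _ver="$1"
  _suffix="$(syft_asset_suffix "$2")" || return 1
  printf 'https://github.com/anchore/syft/releases/download/v%s/syft_%s_%s.tar.gz\n' \
    "$_ver" "$_ver" "$_suffix"
}

# Download the archive and unpack only the syft binary (needs curl and tar).
install_syft() {
  _url="$(syft_download_url "$1" "$2")" || return 1
  curl -fsSL "$_url" | tar -xzf - -C /usr/local/bin syft
}
```

Each Dockerfile could then bind-mount the script in its syft-base stage and call install_syft "${SYFT_VER}" "${BUILDPLATFORM}", so the download logic lives in one place.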

################################ build the syft-base image ###############################

FROM --platform=$BUILDPLATFORM alpine:3.20 AS syft-base
ARG BUILDPLATFORM
Member


I was confused not to see changes in the CI workflows to ensure this is passed in, but now I see... it's defined in the build environment by default: https://docs.docker.com/build/building/multi-platform/#cross-compilation

Just sharing for the benefit of other reviewers.

Comment on lines +8 to +12
ARG SYFT_VER=1.32.0

################################ build the syft-base image ###############################

FROM --platform=$BUILDPLATFORM alpine:3.20 AS syft-base
Member


Suggested change
- ARG SYFT_VER=1.32.0
- ################################ build the syft-base image ###############################
- FROM --platform=$BUILDPLATFORM alpine:3.20 AS syft-base
+ ARG SYFT_ALPINE_VER=notset
+ ARG SYFT_VER=notset
+ ################################ build the syft-base image ###############################
+ FROM --platform=$BUILDPLATFORM alpine:${SYFT_ALPINE_VER} AS syft-base

Let's put the Alpine version and SYFT_VER in versions.yaml instead: https://github.com/rapidsai/ci-imgs/blob/main/versions.yaml. That:

  • keeps it consistent across images
  • allows us to use renovate to easily auto-update it

And let's please avoid putting any hard-coded versions into ARG statements, and instead default them to notset (to give us a chance to catch bugs like "did not successfully pass configuration through").
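
To illustrate how a notset default can surface a configuration bug, here is a small POSIX sh sketch of a guard that a RUN step or mounted script could call. The function name require_set is hypothetical and not something already in the repository.

```shell
# Fail fast when a build argument is empty or still holds the "notset" placeholder.
#   $1 = argument name (used only in the error message)
#   $2 = the value that was passed through
require_set() {
  case "$2" in
    ''|notset)
      echo "error: build argument $1 was not passed through (got '$2')" >&2
      return 1
      ;;
  esac
}
```

A call such as require_set SYFT_VER "${SYFT_VER}" at the top of the install step would then stop the build with a clear message instead of a confusing download failure. For SYFT_ALPINE_VER, the FROM line should already fail on its own, because it would try to pull alpine:notset.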

EOF

FROM miniforge-cuda
# Generate SBOM for the miniforge-cuda stage
Member


We recently removed miniforge-cuda and 26.02 will be the final release where it's published: #345

We only need to generate SBOMs for ci-conda, ci-wheel, and ci-testwheel.

COPY pip.conf /etc/xdg/pip/pip.conf

# Generate SBOM for the citestwheel image
FROM syft-base AS citestwheel-sbom
Member


Suggested change
- FROM syft-base AS citestwheel-sbom
+ FROM syft-base AS sbom

I don't think it's necessary to add citestwheel- and similar prefixes to these stage names. They're already self-contained within 1 Dockerfile.

I recommend standardizing all of them to something generic.

Comment on lines +231 to +236
mkdir -p /out && \
syft scan \
--source-name "rapidsai/citestwheel" \
--scope all-layers \
--output cyclonedx-json@1.6=/out/sbom.json \
dir:/rootfs
Member


The only thing that seems to differ in this call across the dockerfiles is --source-name, and I'm guessing we'd want the other configuration for syft to otherwise be consistent across all images.

Could you move this into a script that's mounted in at build time, similar to rapidsai/docker#840?

The --source-name could be provided by a new build argument, IMAGE_REPO or similar. We already have enough information about that in the GitHub Actions configs to thread it through:

IMAGE_REPO: ${{ inputs.IMAGE_REPO }}
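
A minimal sketch of the shared scan script, in POSIX sh, reusing the syft flags from the diff above. The function name, the positional parameters, and the assumption that IMAGE_REPO holds the repository name without the rapidsai/ prefix are all illustrative choices rather than anything settled in this PR.

```shell
# generate-sbom.sh (hypothetical name)
#   $1 = image repo name, e.g. the value of a new IMAGE_REPO build argument
#   $2 = directory where the built stage is mounted (default: /rootfs)
#   $3 = output directory for the SBOM (default: /out)
generate_sbom() {
  _image_repo="$1"
  _rootfs="${2:-/rootfs}"
  _out_dir="${3:-/out}"

  # Refuse to run with a missing or placeholder repo name.
  case "$_image_repo" in
    ''|notset)
      echo "error: IMAGE_REPO was not passed through" >&2
      return 1
      ;;
  esac

  mkdir -p "$_out_dir" && \
  syft scan \
    --source-name "rapidsai/${_image_repo}" \
    --scope all-layers \
    --output "cyclonedx-json@1.6=${_out_dir}/sbom.json" \
    "dir:${_rootfs}"
}
```

With this in place, the only per-image difference left in each Dockerfile would be the value of IMAGE_REPO, for example generate_sbom "${IMAGE_REPO}".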

@copy-pr-bot

copy-pr-bot bot commented Mar 2, 2026

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.


2 participants