Skip to content

fix: Fix regression in Helm post-renderer error handling#9997

Open
psliwka wants to merge 1 commit intoGoogleContainerTools:mainfrom
psliwka:fix-post-renderer-regression
Open

fix: Fix regression in Helm post-renderer error handling#9997
psliwka wants to merge 1 commit intoGoogleContainerTools:mainfrom
psliwka:fix-post-renderer-regression

Conversation

@psliwka
Copy link

@psliwka psliwka commented Mar 2, 2026

In Skaffold v1, user-provided Helm post-renderers (configured by adding an appropriate --post-renderer flag to deploy.helm.flags in skaffold.yaml) used to be processed by Helm itself. In case a post-renderer crashed and returned a non-zero exit code, Helm detected that and aborted deployment process. This was an important safety measure, since a failing post-renderer is very likely to output incorrect manifests, or no manifests at all.

In Skaffold v2, things work differently: Skaffold injects itself as Helm's one and only post-renderer, and any custom post-renderer specified in skaffold.yaml is being executed from within skaffold filter subprocess, invoked by Helm. The Skaffold's implementation of handling post-renderers is mostly similar to the Helm's one, except one critical detail: the post-renderer's exit code is never checked, causing its output to be blindly consumed by the deployment pipeline, even in case the post-renderer crashes and outputs nothing to its stdout. In practice, this usually causes a seemingly successful deployment reported by Skaffold, while all Kubernetes objects deployed by a previous (successful) release are silently destroyed, causing unexpected service outage.

This commit fixes the issue by by bringing skaffold filter behavior in-line to the Helm's (and thus also Skaffold v1's) one, i.e. a crash of post-renderer stops the deployment pipeline, and a meaningful error message is returned to the user to help with debugging the issue.

NB While I'm not the biggest fan of the error message's format, I intentionally made it to be exactly the same as its Helm counterpart, which I think it's only fair given that we replicate Helm's native behavior here.

Fixes #9377.

In Skaffold v1, user-provided Helm post-renderers (configured by adding
an appropriate `--post-renderer` flag to `deploy.helm.flags` in
skaffold.yaml) used to be processed by Helm itself. In case a
post-renderer crashed and returned a non-zero exit code, Helm detected
that and aborted deployment process. This was an important safety
measure, since a failing post-renderer is very likely to output
incorrect manifests, or no manifests at all.

In Skaffold v2, things work differently: Skaffold injects itself as
Helm's one and only post-renderer, and any custom post-renderer
specified in skaffold.yaml is being executed from within `skaffold
filter` subprocess, invoked by Helm. The Skaffold's implementation of
handling post-renderers is mostly similar to the Helm's one, except one
critical detail: the post-renderer's exit code is never checked, causing
its output to be blindly consumed by the deployment pipeline, even in
case the post-renderer crashes and outputs nothing to its stdout. In
practice, this usually causes a seemingly successful deployment reported
by Skaffold, while all Kubernetes objects deployed by a previous
(successful) release are silently destroyed, causing unexpected service
outage.

This commit fixes the issue by by bringing `skaffold filter` behavior
in-line to the Helm's (and thus also Skaffold v1's) one, i.e. a crash of
post-renderer stops the deployment pipeline, and a meaningful error
message is returned to the user to help with debugging the issue.

NB While I'm not the biggest fan of the error message's format, I
intentionally made it to be exactly the same as its Helm counterpart,
which I think it's only fair given that we replicate Helm's native
behavior here.

Fixes GoogleContainerTools#9377.
@psliwka psliwka requested a review from a team as a code owner March 2, 2026 08:37
@google-cla
Copy link

google-cla bot commented Mar 2, 2026

Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

View this failed invocation of the CLA check for more information.

For the most up to date status, view the checks section at the bottom of the pull request.

@gemini-code-assist
Copy link
Contributor

Summary of Changes

Hello, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request resolves a significant regression in Skaffold v2's handling of Helm post-renderers. Previously, a crashing post-renderer would not halt the deployment process, potentially leading to silent data destruction. The changes ensure that skaffold filter now correctly detects and reports failures from custom post-renderers, restoring a crucial safety mechanism and providing users with actionable error messages.

Highlights

  • Regression Fix for Helm Post-Renderer Error Handling: Addressed a critical regression in Skaffold v2 where user-provided Helm post-renderers failing with a non-zero exit code were not properly detected, leading to silent deployment failures and potential data loss.
  • Enhanced Error Detection: Implemented checks for post-renderer exit codes within skaffold filter, ensuring that deployment pipelines are aborted and clear error messages are returned when a post-renderer crashes.
  • Consistent Error Messaging: Standardized the error message format to match Helm's native error reporting for post-renderer failures, providing a familiar debugging experience.

🧠 New Feature in Public Preview: You can now enable Memory to help Gemini Code Assist learn from your team's feedback. This makes future code reviews more consistent and personalized to your project's style. Click here to enable Memory in your admin console.

Changelog
  • cmd/skaffold/app/cmd/filter.go
    • Imported the bytes package to facilitate error stream capture.
    • Configured cmd.Stderr to capture standard error output from the post-renderer process.
    • Added cmd.Wait() to explicitly check the post-renderer's exit status and return an error if it fails, including its captured stderr output.
  • cmd/skaffold/app/cmd/filter_test.go
    • Added TestFilterPostRendererFailure to test the new error handling for failing post-renderers, asserting that the correct error message is returned upon failure.
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

Copy link
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request correctly identifies and fixes a regression where the exit code of a Helm post-renderer was not being checked. The addition of cmd.Wait() and checking its error is the right approach. However, the current implementation has a potential deadlock issue if manifest parsing fails mid-stream. I've suggested a more robust implementation that buffers the output to prevent this. The added test case is good and covers the failure scenario well.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Skaffold ignores helm post renderer's errors and destroys deployment

1 participant