Analysis Template does not pass after analysis template is fixed #2314

Closed
4 tasks done
davidblum opened this issue Jul 18, 2024 · 6 comments

@davidblum

Checklist

  • I've searched the issue queue to verify this is not a duplicate bug report.
  • I've included steps to reproduce the bug.
  • I've pasted the output of kargo version.
  • I've pasted logs, if applicable.

Description

While building a demo showing how analysis templates can halt automatic promotions, I noticed that after I "fixed" an intentionally "bad" analysis template and re-ran the stage, the next stage was still blocked.

I have 3 stages configured (00, 01, 02), and each stage depends on the previous stage (except stage 00, which depends on the warehouse).

I intentionally set stage 01 to use the "bad" (designed to fail) analysis template.

I run the promotion, Kargo halts at stage 01.

I update stage 01 to point to the "good" (designed to pass) analysis template and re-apply the same git commit to stage 01. This time, the analysis template still fails, and stage 02 is still blocked. I expect the second run with the good analysis template to pass, but it does not.

The only way I have found to "unblock" it is to push a new commit to the branch that Kargo is tracking. In this case, I pushed an arbitrary label change, then ran the promotion from stage 01, and I was able to promote the code to stage 02.

Screenshots

Stage01 updated with "svc-good":
image

Stage 01 re-run with analysis template "svc-good":

image

Steps to Reproduce

Create 3 stages.
Stage 00 subscribes to the warehouse
Stage 01 subscribes to stage 00
Stage 02 subscribes to Stage 01

Stage00 uses a "good" analysis template
Stage01 uses a "bad" analysis template
Stage02 uses a "good" analysis template

Promote the code to stage 00, then to stage 01. Verification should fail at stage 01.

Update stage01 to use the "good" analysis template, rerun the promotion.

Stage02 will still not "unblock" even though the analysis template passes.
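
The setup above can be modeled in a few lines. This is only an illustrative sketch, not Kargo code: the `Stage` class, `can_promote`, and the Freight ID `freight-abc` are hypothetical names, and the gating rule (a stage accepts Freight only after it has been verified in its upstream stage) is an assumption drawn from the behavior described in this report.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Set


@dataclass
class Stage:
    """Hypothetical stand-in for a stage in the reproduction setup."""
    name: str
    upstream: Optional[str]  # None means the stage subscribes to the warehouse
    verified: Set[str] = field(default_factory=set)  # Freight that passed verification here


def can_promote(freight: str, target: Stage, stages: Dict[str, Stage]) -> bool:
    """Assumed gating rule: Freight must be verified in the upstream stage."""
    if target.upstream is None:
        return True  # Freight comes straight from the warehouse
    return freight in stages[target.upstream].verified


stages = {
    "stage00": Stage("stage00", upstream=None),
    "stage01": Stage("stage01", upstream="stage00"),
    "stage02": Stage("stage02", upstream="stage01"),
}

# stage00 uses the "good" template, so the Freight is recorded as verified there.
stages["stage00"].verified.add("freight-abc")

# stage01 uses the "bad" template, so nothing is recorded for it,
# and stage02 cannot receive the Freight.
stage02_blocked = not can_promote("freight-abc", stages["stage02"], stages)
```

In this model, stage 02 stays blocked until a passing verification for the same Freight is recorded on stage 01, which is the state the report describes.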

Version

14:36:48 ❯ kargo version
Client Version: v0.7.1
Server Version: v0.7.1

Logs

I was unable to find any logs showing the analysistemplates being run

@krancour
Member

I was unable to find any logs showing the analysistemplates being run

That's the key to your entire problem.

Verification isn't directly tied into the promotion process; rather it is part of the Stage's lifecycle. I think we need to do more to clarify this / make it more obvious.

After you re-promoted the same piece of Freight and the promotion succeeded, the Stage's state had not really changed. When the Stage reconciler checks whether it needs to kick off verification for the Stage and its current collection of Freight, it finds that a verification has already run for this combination (and failed). It assumes another attempt to verify the same Stage with the same collection of Freight will yield the same result, so it does not automatically kick off verification in this case.

If instead of re-promoting, you re-verify, that is an explicit request to re-attempt verification, and I believe you will see the expected results.
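
A rough model of the decision described above may make it easier to follow. This is a simplified sketch and not the actual Stage reconciler: `StageModel`, `needs_verification`, `reconcile`, and the `reverify_requested` flag are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional, Tuple


@dataclass
class StageModel:
    """Hypothetical, simplified view of a Stage and its verification history."""
    name: str
    freight: Tuple[str, ...]          # current collection of Freight IDs
    reverify_requested: bool = False  # set by an explicit re-verify request
    # Last verification result recorded for each Freight collection.
    results: Dict[Tuple[str, ...], str] = field(default_factory=dict)


def needs_verification(stage: StageModel) -> bool:
    """Verify if explicitly asked to, or if this Freight collection has no result yet."""
    if stage.reverify_requested:
        return True
    return stage.freight not in stage.results


def reconcile(stage: StageModel, run_analysis: Callable[[], str]) -> Optional[str]:
    """Run verification only when needed; return None when it is skipped."""
    if not needs_verification(stage):
        return None
    result = run_analysis()
    stage.results[stage.freight] = result
    stage.reverify_requested = False
    return result


stage01 = StageModel("stage01", freight=("commit-abc",))

first = reconcile(stage01, lambda: "failed")       # "bad" template runs and fails
repromoted = reconcile(stage01, lambda: "passed")  # same Freight: verification skipped

stage01.reverify_requested = True                  # explicit request to re-verify
reverified = reconcile(stage01, lambda: "passed")  # fixed template runs
```

Under these assumptions, re-promoting the same Freight leaves the recorded failure in place, while an explicit re-verify request forces a fresh run against the fixed template.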

@davidblum
Author

Hey Kent! Thanks for your reply.

Verification isn't directly tied into the promotion process; rather it is part of the Stage's lifecycle. I think we need to do more to clarify this / make it more obvious.

This part is a bit confusing.

How would you suggest handling the case where the analysis template or verification step itself is broken?

I.e., we have a bad test, I push a fix, and then I re-deploy the blocked stage.

I would expect the verification to be re-run and pass, not require an entirely new deploy. The reasoning is that the failed validation came from the analysisTemplate, not the code being deployed.

@krancour
Member

@davidblum I think we're already on the same page without realizing it...

You don't need to re-promote. You can just trigger another verification attempt. 😄

kargo verify stage <stage> -p <project>

Or use the re-verify button in the UI:

Screenshot 2024-07-23 at 11 26 15 AM
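
If the re-verify step needs to be scripted, for example right after a fixed analysis template is applied, a thin wrapper around the CLI command above is one option. This is a minimal sketch: `verify_command` and `reverify` are hypothetical helper names, the stage and project values are placeholders, and it assumes the `kargo` binary is on `PATH` and already logged in.

```python
import shutil
import subprocess
from typing import List


def verify_command(stage: str, project: str) -> List[str]:
    """Build the argument list for `kargo verify stage <stage> -p <project>`."""
    return ["kargo", "verify", "stage", stage, "-p", project]


def reverify(stage: str, project: str) -> int:
    """Run the verify command and return the CLI's exit code."""
    cmd = verify_command(stage, project)
    if shutil.which(cmd[0]) is None:
        raise FileNotFoundError("kargo CLI not found on PATH")
    # check=False so the caller can decide how to handle a non-zero exit code.
    return subprocess.run(cmd, check=False).returncode
```

For example, `reverify("stage01", "my-project")` would request another verification attempt for that stage without promoting anything, where both names are placeholders.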

@davidblum
Author

@krancour I see what you mean!

However, I did find a discrepancy.

I'm running version 0.7.1 and I do not have the reverify button.

However, if I use the CLI, I can issue another verify after the analysisTemplate is "fixed".

Thanks so much for helping me understand this.

I totally agree, we were saying the same thing! Perhaps some additional docs could help avoid a similar confusion in the future.

@krancour
Member

@davidblum sorry about that! The missing re-verify button in v0.7.1 was a bug that was fixed in #2287, which made it into v0.8.0.

@krancour
Member

Created #2339 to track a doc enhancement to cover kargo verify.
