fix(orca-clouddriver): make Deploy Manifest stage work with v4 spel evaluator and manifests contained in spel expressions #4823

apoorvmahajan · 2025-01-11T03:21:49Z

Motivation

We don't need to map the DeployManifestStage context to the DeployManifestStageContext all the time. This causes issues with the v4 spel evaluator where the context may have spel expressions that haven't been evaluated yet (like the manifests key).

Changes

This PR fixes the DeployManifestStage to not deserialize the entire context unless needed, but only fetch certain things from the context. This makes it work for pipelines with v4 spel evaluator containing manifests that are in spel expressions.

Testing

existing unit tests still pass

…valuator and manifests contained in spel expressions

dbyron-sf · 2025-01-11T03:44:55Z

Hey there. Good to hear from you!

I'm all for reducing unnecessary work, but I could use a bit more explanation, and ideally yes a new automated test. How do SpEL expressions in manifests cause problems for deserialization? I can sort of see it, but you give an example?

apoorvmahajan · 2025-01-14T22:43:18Z

.../main/java/com/netflix/spinnaker/orca/clouddriver/pipeline/manifest/DeployManifestStage.java

@@ -92,7 +93,7 @@ public boolean processExpressions(
  @Override
  public void afterStages(@Nonnull StageExecution stage, @Nonnull StageGraphBuilder graph) {
    TrafficManagement trafficManagement =
-        stage.mapTo(DeployManifestContext.class).getTrafficManagement();
+        stage.mapTo("/trafficManagement", TrafficManagement.class);


@dbyron-sf

Good to hear from you as well!

I think this example captures the problem:

Create a pipeline with spel evaluator: v4. Add a Deploy Manifest stage with Manifest Source as Text. For the manifest, add a spel expression that is not correct. For example ${someManifestThatIsSkipped} .

Make sure that you have a stage enabled spel expression such that it skips this stage upon execution.

Expected behavior is:

the stage shouldn't run because of the stageEnabled spel evaluating to false. And the pipeline should be successful.

Actual Behaviour:

the stage fails because it tries to map the entire stage instead of just the relevant bits in the places added in this PR. And the pipeline fails as well.

Hope that helps. I'll try to add an automated test capturing this behavior once I find some time

Yeah, this totally helps. Big picture it feels like there's a tension between SpEL expression evaluation and how the stage skipping logic works. Like, if a stage has:

"stageEnabled": { "expression": "false", "type": "expression" },

it seems reasonable to argue that SpEL expression evaluation wouldn't happen in the rest of the stage. Even with your fix (which I like), the stage would still fail if there was some busted SpEL expression in trafficManagement. Fixing that seems like a bigger change though.

For the record, I reproduced this with:

{ "appConfig": {}, "keepWaitingPipelines": false, "lastModifiedBy": "dbyron@salesforce.com", "limitConcurrent": true, "spelEvaluator": "v4", "stages": [ { "account": "my-account", "cloudProvider": "kubernetes", "manifests": [ "${bustedExpression}" ], "moniker": { "app": "dbyrontest" }, "name": "Deploy (Manifest)", "refId": "1", "requisiteStageRefIds": [], "skipExpressionEvaluation": false, "source": "text", "stageEnabled": { "expression": "false", "type": "expression" }, "trafficManagement": { "enabled": false, "options": { "enableTraffic": false, "services": [] } }, "type": "deployManifest" } ], "triggers": [], "updateTs": "1736899563161" }

Switching to "spelEvaluator": "v3", made the pipeline succeed, with the stage skipped.

yeah that may be correct, it could still fail in the traffic management piece, haven't tried it myself. There are possibly two more issues with the v4 spel evaluator:
1 - it does inconsistent overrides of vars set in later stages (i.e. set one var in one evaluate variable stage, and override it later, it doesn't always pick up the overridden value)
2 - it doesn't seem to work nicely with the notifications spel expressions. Complex spel expressions don't always always work, and conditionally evaluating which notification to run (on stage complete/failed/starting) fails if the spel contains a variable set in that stage itself.

so long story short, yeah there are more potential problems with v4, which we can attempt to tackle in a later PR imo

And there I thought v4 was such an improvement over v3...at least that evaluation happened more predictably, and at the right time...there are so many mysteries.

fwiw, it is still (much) better than v3 imo just for the at the right time capability. It helps in reducing the number of evaluate variable stages you need in a pipeline by enabling reuse of variables in other spel expressions in that same stage

One way to make things better, that I think is necessary even if there weren't any problems with v4 is....allowing for nested SpEL expressions. In other words, if the result of evaluating one SpEL expression is another SpEL expression, evaluate that one too. It might also paper over some of the struggles with v4. I think this change is actually fairly straightforward, but we haven't gotten to it yet. Pretty sure the way to do that is to add

if (isExpression(result))

here in ExpressionTransform.transform, and call transform recursively (with appropriate config flags to enable / limit the depth).

fix(orca-clouddriver): make Deploy Manifest stage work with v4 spel e…

0c9462d

…valuator and manifests contained in spel expressions

apoorvmahajan commented Jan 14, 2025

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix(orca-clouddriver): make Deploy Manifest stage work with v4 spel evaluator and manifests contained in spel expressions #4823

fix(orca-clouddriver): make Deploy Manifest stage work with v4 spel evaluator and manifests contained in spel expressions #4823

apoorvmahajan commented Jan 11, 2025 •

edited

Loading

dbyron-sf commented Jan 11, 2025

apoorvmahajan Jan 14, 2025

dbyron-sf Jan 15, 2025 •

edited

Loading

apoorvmahajan Jan 15, 2025

dbyron-sf Jan 15, 2025

apoorvmahajan Jan 15, 2025 •

edited

Loading

dbyron-sf Jan 15, 2025

fix(orca-clouddriver): make Deploy Manifest stage work with v4 spel evaluator and manifests contained in spel expressions #4823

Are you sure you want to change the base?

fix(orca-clouddriver): make Deploy Manifest stage work with v4 spel evaluator and manifests contained in spel expressions #4823

Conversation

apoorvmahajan commented Jan 11, 2025 • edited Loading

Motivation

Changes

Testing

dbyron-sf commented Jan 11, 2025

apoorvmahajan Jan 14, 2025

Choose a reason for hiding this comment

dbyron-sf Jan 15, 2025 • edited Loading

Choose a reason for hiding this comment

apoorvmahajan Jan 15, 2025

Choose a reason for hiding this comment

dbyron-sf Jan 15, 2025

Choose a reason for hiding this comment

apoorvmahajan Jan 15, 2025 • edited Loading

Choose a reason for hiding this comment

dbyron-sf Jan 15, 2025

Choose a reason for hiding this comment

apoorvmahajan commented Jan 11, 2025 •

edited

Loading

dbyron-sf Jan 15, 2025 •

edited

Loading

apoorvmahajan Jan 15, 2025 •

edited

Loading