Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(orca-clouddriver): make Deploy Manifest stage work with v4 spel evaluator and manifests contained in spel expressions #4823

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

apoorvmahajan
Copy link
Contributor

@apoorvmahajan apoorvmahajan commented Jan 11, 2025

Motivation

We don't need to map the DeployManifestStage context to the DeployManifestStageContext all the time. This causes issues with the v4 spel evaluator where the context may have spel expressions that haven't been evaluated yet (like the manifests key).

Changes

This PR fixes the DeployManifestStage to not deserialize the entire context unless needed, but only fetch certain things from the context. This makes it work for pipelines with v4 spel evaluator containing manifests that are in spel expressions.

Testing

  • existing unit tests still pass

…valuator and manifests contained in spel expressions
@dbyron-sf
Copy link
Contributor

Hey there. Good to hear from you!

I'm all for reducing unnecessary work, but I could use a bit more explanation, and ideally yes a new automated test. How do SpEL expressions in manifests cause problems for deserialization? I can sort of see it, but you give an example?

@@ -92,7 +93,7 @@ public boolean processExpressions(
@Override
public void afterStages(@Nonnull StageExecution stage, @Nonnull StageGraphBuilder graph) {
TrafficManagement trafficManagement =
stage.mapTo(DeployManifestContext.class).getTrafficManagement();
stage.mapTo("/trafficManagement", TrafficManagement.class);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dbyron-sf

Good to hear from you as well!

I think this example captures the problem:

  • Create a pipeline with spel evaluator: v4. Add a Deploy Manifest stage with Manifest Source as Text. For the manifest, add a spel expression that is not correct. For example ${someManifestThatIsSkipped} .
  • Make sure that you have a stage enabled spel expression such that it skips this stage upon execution.

Expected behavior is:

  • the stage shouldn't run because of the stageEnabled spel evaluating to false. And the pipeline should be successful.

Actual Behaviour:

  • the stage fails because it tries to map the entire stage instead of just the relevant bits in the places added in this PR. And the pipeline fails as well.

Hope that helps. I'll try to add an automated test capturing this behavior once I find some time

Copy link
Contributor

@dbyron-sf dbyron-sf Jan 15, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, this totally helps. Big picture it feels like there's a tension between SpEL expression evaluation and how the stage skipping logic works. Like, if a stage has:

  "stageEnabled": {
    "expression": "false",
    "type": "expression"
  },

it seems reasonable to argue that SpEL expression evaluation wouldn't happen in the rest of the stage. Even with your fix (which I like), the stage would still fail if there was some busted SpEL expression in trafficManagement. Fixing that seems like a bigger change though.

For the record, I reproduced this with:

{
  "appConfig": {},
  "keepWaitingPipelines": false,
  "lastModifiedBy": "dbyron@salesforce.com",
  "limitConcurrent": true,
  "spelEvaluator": "v4",
  "stages": [
    {
      "account": "my-account",
      "cloudProvider": "kubernetes",
      "manifests": [
        "${bustedExpression}"
      ],
      "moniker": {
        "app": "dbyrontest"
      },
      "name": "Deploy (Manifest)",
      "refId": "1",
      "requisiteStageRefIds": [],
      "skipExpressionEvaluation": false,
      "source": "text",
      "stageEnabled": {
        "expression": "false",
        "type": "expression"
      },
      "trafficManagement": {
        "enabled": false,
        "options": {
          "enableTraffic": false,
          "services": []
        }
      },
      "type": "deployManifest"
    }
  ],
  "triggers": [],
  "updateTs": "1736899563161"
}

Switching to "spelEvaluator": "v3", made the pipeline succeed, with the stage skipped.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah that may be correct, it could still fail in the traffic management piece, haven't tried it myself. There are possibly two more issues with the v4 spel evaluator:
1 - it does inconsistent overrides of vars set in later stages (i.e. set one var in one evaluate variable stage, and override it later, it doesn't always pick up the overridden value)
2 - it doesn't seem to work nicely with the notifications spel expressions. Complex spel expressions don't always always work, and conditionally evaluating which notification to run (on stage complete/failed/starting) fails if the spel contains a variable set in that stage itself.

so long story short, yeah there are more potential problems with v4, which we can attempt to tackle in a later PR imo

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And there I thought v4 was such an improvement over v3...at least that evaluation happened more predictably, and at the right time...there are so many mysteries.

Copy link
Contributor Author

@apoorvmahajan apoorvmahajan Jan 15, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fwiw, it is still (much) better than v3 imo just for the at the right time capability. It helps in reducing the number of evaluate variable stages you need in a pipeline by enabling reuse of variables in other spel expressions in that same stage

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One way to make things better, that I think is necessary even if there weren't any problems with v4 is....allowing for nested SpEL expressions. In other words, if the result of evaluating one SpEL expression is another SpEL expression, evaluate that one too. It might also paper over some of the struggles with v4. I think this change is actually fairly straightforward, but we haven't gotten to it yet. Pretty sure the way to do that is to add

if (isExpression(result))

here in ExpressionTransform.transform, and call transform recursively (with appropriate config flags to enable / limit the depth).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants