Retry failed runs with ObserveAndDelete policy #301
cmd/provider/main.go
Outdated
@@ -42,6 +43,7 @@ func main() {
timeout = app.Flag("timeout", "Controls how long Ansible processes may run before they are killed.").Default("20m").Duration()
leaderElection = app.Flag("leader-election", "Use leader election for the controller manager.").Short('l').Default("false").OverrideDefaultFromEnvar("LEADER_ELECTION").Bool()
maxReconcileRate = app.Flag("max-reconcile-rate", "The maximum number of concurrent reconciliation operations.").Default("1").Int()
retryFailed = app.Flag("retry-failed", "Whether to retry failed runs with ObserveAndDelete policy (with CheckWhenObserve, they are retried unconditionally).").Default("false").Bool()
Idk if this flag is a good idea: retrying failed syncs, to me, is something that a k8s controller should always do, it shouldn't even be configurable. And it's unconditionally retried with CheckWhenObserve policy.
But there might be provider-ansible users who already rely on the existing no-retries behavior, which is the only reason I added the flag. LMK what you'd prefer
I agree. retryFailed seems to be the usual case most of the time.
there might be provider-ansible users who already rely on the existing no-retries behavior
Do you have a concrete use case for this? If not, I'd prefer making this the default behavior and opening another issue to track whether this flag needs to be introduced.
BTW: It looks like the retryFailed flag is not actually used anywhere in the code in this PR.
🤦 yup forgot to consume it in handleLastApplied
Anyway, removed the flag
cmd/provider/main.go
Outdated
@@ -76,6 +78,13 @@ func main() {
Features: &feature.Flags{},
}

kingpin.FatalIfError(ansible.Setup(mgr, o, *ansibleCollectionsPath, *ansibleRolesPath, *timeout), "Cannot setup Ansible controllers")
ao := ansiblerun.SetupOptions{
The list of parameters that apply only to the ansiblerun controller and not to the config controller was getting too long as I added another flag, so I decided to refactor a bit and put them into a struct
}

if !isUpToDate {
This whole block has just changed indentation cause it was moved out of the if. There aren't actually any logic changes
@morningspace ready for review 🙏
@d-honeybadger There are some lint errors and conflicts that need to be resolved.
600c974 to 8fc237f
Signed-off-by: Dasha Komsa <komsa.darya@gmail.com>
8fc237f to 2d0c627
7205cf6 to f54bc38
@morningspace sorry about the linting errors! Turns out I commented out the linting line in the makefile cause at some point it was broken for me, and of course forgot that I did that...
It looks awesome.
Description of your changes
Fixes #295
On backoff: Since Observe returns an error whenever a retry run fails, crossplane-runtime will automatically apply backoff. We don't have control over the interval, max steps, etc., but it's going to be the same default backoff that crossplane-runtime applies whenever Update fails, so we should have behavior that matches the CheckWhenObserve policy's updates and also matches other providers.
I have:
- Run make reviewable to ensure this PR is ready for review.
- Added backport release-x.y labels to auto-backport this PR if necessary.
How has this code been tested