Minimal level-triggered controller #200
Conversation
Force-pushed from 6b0d78d to 48e01d2
Force-pushed from 48e01d2 to 273d779
Looks good to me; my only concern is the module placement of the new controller. When/if we drop the original logic, make sure we move the controller under `controllers`. Or even better, move it now and rename it `controllers.NewLevelTriggeringPipelineReconciler`. It's a common code structure to store controllers under `controllers`, and it bothers me to see the new one under a different module. (The `controllers` directory is plural.)
Force-pushed from 42812af to f734e82
We can move the code when the old one is fully deprecated, yep. The forked controller went in its own package mainly because the tests rely on a TestMain which sets the controller running, and it is easier and cleaner if the two controller implementations each have their own TestMain. If the objectionable bit is that the implementations are in `controllers/` and a subdirectory of `controllers/`, I could move the old one to, e.g., …
This looks great as a basis for making progress. If there's anything else, we can deal with it later.
We want to react to clusters changing, especially if they are going from an unready state to a ready state. This fixes some small but egregious problems with how that works:

- the helper funcs had misleading names
- the key, though an arbitrary string, was misleading
- the index lookup did some unnecessary work in compiling a list of deep copies of query results

Signed-off-by: Michael Bridgen <michael.bridgen@weave.works>
As it stood, the controller logic would send an event with reason="SetStatusConditionError" when the cluster was not found and setting the status _succeeded_. This commit rearranges this logic so that an event with reason=GetClusterError is sent if there's any problem resolving the clusterRef, and reason=SetStatusConditionError is only sent if the status condition could not be set.

Signed-off-by: Michael Bridgen <michael.bridgen@weave.works>
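As a sketch of what this rearrangement might look like (the helper names `getCluster` and `setReadyCondition`, and the `recorder` field, are assumptions for illustration, not the PR's actual code):

```go
package leveltriggered

import (
	"context"

	corev1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/types"
)

// setClusterReadiness shows the rearranged error handling; the reconciler
// type and its helpers are assumed from context.
func (r *PipelineReconciler) setClusterReadiness(ctx context.Context, pipeline *Pipeline, clusterKey types.NamespacedName) error {
	cluster, err := r.getCluster(ctx, clusterKey)
	if err != nil {
		// Any problem resolving the clusterRef -- including "not found" --
		// is reported with reason GetClusterError ...
		r.recorder.Event(pipeline, corev1.EventTypeWarning, "GetClusterError", err.Error())
		return err
	}
	if err := r.setReadyCondition(ctx, pipeline, cluster); err != nil {
		// ... and SetStatusConditionError is used only when recording the
		// status condition actually fails.
		r.recorder.Event(pipeline, corev1.EventTypeWarning, "SetStatusConditionError", err.Error())
		return err
	}
	return nil
}
```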
This adds checks to the pipeline test to make sure that a pipeline will be re-examined when a cluster it refers to 1. is created, and 2. reaches a ready state. This relies on the indexing working, since the test timeout is much shorter than the default requeue time (after which the pipeline would be re-examined anyway).

Signed-off-by: Michael Bridgen <michael.bridgen@weave.works>
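A condensed sketch of the kind of check added here, assuming a Gomega-based test harness; `k8sClient`, the `Pipeline` type, and the `makePipeline`/`makeCluster` helpers are hypothetical names standing in for the test fixtures:

```go
package leveltriggered

import (
	"context"
	"testing"
	"time"

	"github.com/onsi/gomega"
	apimeta "k8s.io/apimachinery/pkg/api/meta"
	"sigs.k8s.io/controller-runtime/pkg/client"
)

func TestPipelineReactsToCluster(t *testing.T) {
	g := gomega.NewWithT(t)
	ctx := context.Background()

	// Create the pipeline first, referring to a cluster that doesn't
	// exist yet.
	pipeline := makePipeline("ns", "pipeline")
	g.Expect(k8sClient.Create(ctx, pipeline)).To(gomega.Succeed())

	cluster := makeCluster("ns", "cluster")
	g.Expect(k8sClient.Create(ctx, cluster)).To(gomega.Succeed())
	// (mark the cluster's status as ready here)

	// The timeout is much shorter than the default requeue interval, so
	// this only passes if the cluster change triggers a reconciliation.
	g.Eventually(func() bool {
		var p Pipeline
		if err := k8sClient.Get(ctx, client.ObjectKeyFromObject(pipeline), &p); err != nil {
			return false
		}
		return apimeta.IsStatusConditionTrue(p.Status.Conditions, "Ready")
	}, 5*time.Second, 100*time.Millisecond).Should(gomega.BeTrue())
}
```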
I want the new behaviour to be guarded by a feature flag, and rather than sprinkling conditionals through the controller code, I'd prefer to have a completely separate set of code and dispatch to it in main.go. A reasonable starting point is the pretty minimal existing controller code. So: copy the controller code into a new package and adjust as needed to get `go test -v ./controllers/leveltriggered` to succeed.

This commit also makes the test main wait for the manager to shut down. This is minorly superstitious on my part: my experience is that failure to shut managers down cleanly can cause problems with subsequent test runs.

Signed-off-by: Michael Bridgen <michael.bridgen@weave.works>
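A minimal sketch of a per-package TestMain with this shutdown behaviour, assuming controller-runtime's envtest; CRD paths, scheme setup, and reconciler registration are elided:

```go
package leveltriggered

import (
	"context"
	"os"
	"testing"

	ctrl "sigs.k8s.io/controller-runtime"
	"sigs.k8s.io/controller-runtime/pkg/envtest"
)

func TestMain(m *testing.M) {
	testEnv := &envtest.Environment{} // CRD paths etc. omitted
	cfg, err := testEnv.Start()
	if err != nil {
		panic(err)
	}

	mgr, err := ctrl.NewManager(cfg, ctrl.Options{})
	if err != nil {
		panic(err)
	}
	// (register the level-triggered reconciler with mgr here)

	ctx, cancel := context.WithCancel(context.Background())
	done := make(chan struct{})
	go func() {
		defer close(done)
		if err := mgr.Start(ctx); err != nil {
			panic(err)
		}
	}()

	code := m.Run()

	// Stop the manager and wait for it to shut down cleanly before
	// exiting, so it can't interfere with subsequent test runs.
	cancel()
	<-done
	_ = testEnv.Stop()
	os.Exit(code)
}
```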
The level-triggered promotion algorithm will need to know the status of each target in the pipeline, so it can make a decision on if and where a promotion needs to be triggered. To be explicit about how it makes that decision, the controller should record the status it's seen for each target. This will help with testing, and makes the Pipeline object self-contained -- you should be able to see from the status where the pipeline is up to.

This commit introduces a couple more types into the API to represent the status of a target. Each one gets a full name (the type, namespace, name, and cluster if it's in another cluster). When the controller goes through the environments, it fetches the target apps. For the sake of not making too many decisions at once, I've implemented only the case of HelmRelease objects in the local cluster. Once there are tests demonstrating it works, it can be generalised to other app types (the API accepts _any_ API group, version, and kind).

The test involving a GitOpsCluster is skipped, because we would now expect to fetch the application object from the remote cluster it represents, rather than just examine the cluster object itself.

Signed-off-by: Michael Bridgen <michael.bridgen@weave.works>
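A hypothetical shape for the new status types, to make the description concrete; the real v1alpha1 field names may differ:

```go
// TargetStatus records what the controller has seen for one target.
type TargetStatus struct {
	ClusterAppReference `json:",inline"`
	Ready               bool   `json:"ready"`
	Error               string `json:"error,omitempty"`
}

// ClusterAppReference fully names an application object: its kind,
// namespace and name, plus the cluster when it lives in another cluster.
type ClusterAppReference struct {
	Kind      string `json:"kind"` // e.g. HelmRelease
	Namespace string `json:"namespace"`
	Name      string `json:"name"`
	// ClusterRef is empty when the app is in the local cluster.
	ClusterRef string `json:"clusterRef,omitempty"`
}
```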
To date, the controller didn't have much to do other than validate that it could see the cluster objects mentioned in a pipeline object (which wasn't itself necessary, since taking actions relied on the information given in webhooks). Its idea of "Ready" was that it could see all the clusters, and they were themselves ready. If any cluster was not present, it could exit early as unready.

Now, however, we want to actually go and fetch the status of the targets mentioned in the pipeline. "Ready" must have a different meaning: it means the controller can reach all the targets, and thereby determine the state of play for the pipeline. It could exit early in principle, but I have preferred to assess all the targets even if some are unavailable, and give a summary in the ready condition at the end.

The additional test will fail for now, since it checks that adding a previously missing app object will result in a ready pipeline, but there's nothing to trigger the pipeline to be re-examined, yet.

Signed-off-by: Michael Bridgen <michael.bridgen@weave.works>
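The "assess everything, summarise at the end" shape might look roughly like this; `checkTarget`, `recordTargetStatus`, the reason strings, and the Pipeline fields are illustrative assumptions, not the PR's actual names:

```go
package leveltriggered

import (
	"context"
	"fmt"
	"strings"

	apimeta "k8s.io/apimachinery/pkg/api/meta"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

func (r *PipelineReconciler) assessTargets(ctx context.Context, pipeline *Pipeline) {
	var unreachable []string
	for _, env := range pipeline.Spec.Environments {
		for _, target := range env.Targets {
			status, err := r.checkTarget(ctx, target)
			if err != nil {
				// Keep going rather than exiting early, so the summary
				// covers every target.
				unreachable = append(unreachable, fmt.Sprintf("%s/%s: %v", env.Name, target.Name, err))
				continue
			}
			recordTargetStatus(&pipeline.Status, env.Name, target, status)
		}
	}

	ready := metav1.Condition{Type: "Ready", Status: metav1.ConditionTrue, Reason: "ReconciliationSucceeded"}
	if len(unreachable) > 0 {
		ready.Status = metav1.ConditionFalse
		ready.Reason = "TargetsUnreachable"
		ready.Message = strings.Join(unreachable, "; ")
	}
	apimeta.SetStatusCondition(&pipeline.Status.Conditions, ready)
}
```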
This adds indexing to the level-triggered controller, and watches `HelmRelease` objects. If a HelmRelease object changes, it's looked up against the index, and any pipeline using it will be queued to be reconciled. This is enough to pass the unit test that checks if a Pipeline becomes ready if it can see all its targets.

The indexing is general with respect to the application kind, because it includes the Group and Kind in the index key. But:

- the watch is for the HelmRelease type specifically; and,
- it does not account for remote clusters.

The next aim will be to make scaffolding that watches any application, when it's mentioned in a pipeline. Remote clusters will come later.

Signed-off-by: Michael Bridgen <michael.bridgen@weave.works>
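A sketch of this index-and-watch wiring, assuming a controller-runtime v0.14-era API (the `Watches` and `MapFunc` signatures changed in later releases); the index name and the `targetIndexKey`/`indexKeyForObject` helpers are hypothetical:

```go
package leveltriggered

import (
	"context"

	helmv2 "github.com/fluxcd/helm-controller/api/v2beta1"
	"k8s.io/apimachinery/pkg/types"
	ctrl "sigs.k8s.io/controller-runtime"
	"sigs.k8s.io/controller-runtime/pkg/client"
	"sigs.k8s.io/controller-runtime/pkg/handler"
	"sigs.k8s.io/controller-runtime/pkg/source"
)

const applicationKey = ".spec.environments.targets.application" // illustrative index name

func (r *PipelineReconciler) SetupWithManager(mgr ctrl.Manager) error {
	// Index each pipeline under a Group/Kind-qualified key for every app
	// it targets, so a change to an app maps back to the pipelines using it.
	if err := mgr.GetFieldIndexer().IndexField(context.Background(), &Pipeline{}, applicationKey,
		func(obj client.Object) []string {
			p := obj.(*Pipeline)
			var keys []string
			for _, env := range p.Spec.Environments {
				for _, target := range env.Targets {
					keys = append(keys, targetIndexKey(p, target)) // includes Group and Kind
				}
			}
			return keys
		}); err != nil {
		return err
	}

	return ctrl.NewControllerManagedBy(mgr).
		For(&Pipeline{}).
		// The watch is for the HelmRelease type specifically, for now.
		Watches(&source.Kind{Type: &helmv2.HelmRelease{}},
			handler.EnqueueRequestsFromMapFunc(r.pipelinesForApplication)).
		Complete(r)
}

func (r *PipelineReconciler) pipelinesForApplication(obj client.Object) []ctrl.Request {
	// Look the changed object up against the index, using the same
	// Group/Kind-qualified key as above.
	var list PipelineList
	if err := r.List(context.Background(), &list,
		client.MatchingFields{applicationKey: indexKeyForObject(obj)}); err != nil {
		return nil
	}
	var reqs []ctrl.Request
	for _, p := range list.Items {
		reqs = append(reqs, ctrl.Request{NamespacedName: types.NamespacedName{Namespace: p.Namespace, Name: p.Name}})
	}
	return reqs
}
```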
Eventually, this code will need to be able to query for targets in remote clusters. To make room for that, I've done a small rearrangement here such that the client used for a target is obtained from a procedure, rather than assumed to be the client for the local cluster. This change is more to check that this structure makes sense than to alter the behaviour.

Signed-off-by: Michael Bridgen <michael.bridgen@weave.works>
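Schematically, the procedure might look like this; the parameter and the assumption that the reconciler embeds `client.Client` are illustrative:

```go
package leveltriggered

import (
	"context"
	"fmt"

	"sigs.k8s.io/controller-runtime/pkg/client"
)

// clientFor returns the client to use for a target, rather than the
// caller assuming the local cluster's client.
func (r *PipelineReconciler) clientFor(ctx context.Context, clusterName string) (client.Client, error) {
	if clusterName == "" {
		// No cluster named: the target is in the local cluster.
		return r.Client, nil
	}
	// Later, this is where a client for the remote cluster named by the
	// target's clusterRef would be constructed, or fetched from a cache.
	return nil, fmt.Errorf("remote clusters not supported yet: %s", clusterName)
}
```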
This commit expands the test a little to make sure that when an app (HelmRelease, here) has its status updated, the pipeline's status is updated in sympathy. In making the test pass, this corrects a bug in the implementation so far: when going through the environments in the controller, the environment status was not entered into the map of environment statuses.

Signed-off-by: Michael Bridgen <michael.bridgen@weave.works>
The v1alpha1 CRD has new fields as part of this change. These are intended to be backward-compatible, but since they are nonetheless changed and are included in the chart, the chart version should be incremented.

Signed-off-by: Michael Bridgen <michael.bridgen@weave.works>
Force-pushed from f734e82 to c2403e0
This does the preliminary work requested in #194, so that the implementation of level-triggered pipelines can make progress in several (or at least two) directions. After this is merged, we have a controller that checks the status of its targets, so the promotion algorithm can be implemented (#195).
This does just (about) enough to build the promotion algorithm on top, and thereby get to a usable controller. I made simplifications to get here reasonably quickly:

- only HelmRelease application objects are supported; and,
- only targets in the local cluster are examined.

Removing those limitations is the work of #196.
I put the new implementation behind a feature flag, so we can work on it without having long-running branches. For many of the subsequent tasks in the umbrella issue #193 it will be enough to work in `controllers/leveltriggered/` and write tests there. When we want to work on the GUI or other things needing a running controller process, the flag `--enable-level-triggered` can be given to the binary to start the new controller code.

Fixes #194. With reference to this criterion: "objects that aren't mentioned in a pipeline don't get watched" -- this implementation limits support to a specific type, and the local cluster, and uses the convenience builder API in controller-runtime, so it avoids creating informers and watchers and so on dynamically. The expanded implementation (#196) will need to pay careful attention to this, though.
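The dispatch in main.go might look roughly like this excerpt; the flag name is from the PR description, but the reconciler struct fields are illustrative, and `mgr` and `setupLog` are assumed to come from the surrounding setup code:

```go
// In main(), after the manager has been constructed.
var enableLevelTriggered bool
flag.BoolVar(&enableLevelTriggered, "enable-level-triggered", false,
	"use the level-triggered controller implementation instead of the original one")
flag.Parse()

var err error
if enableLevelTriggered {
	err = (&leveltriggered.PipelineReconciler{
		Client: mgr.GetClient(),
		Scheme: mgr.GetScheme(),
	}).SetupWithManager(mgr)
} else {
	err = (&controllers.PipelineReconciler{
		Client: mgr.GetClient(),
		Scheme: mgr.GetScheme(),
	}).SetupWithManager(mgr)
}
if err != nil {
	setupLog.Error(err, "unable to set up pipeline controller")
	os.Exit(1)
}
```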