-
Notifications
You must be signed in to change notification settings - Fork 47
Conversation
c567c95
to
13c8b03
Compare
Added extra commits, will squash them later |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should add test for all that. Considering that all rotation functionality is exposed using cluster
package, maybe we can import it for e2e tests and run the rotation from there? We don't need to test if rotation works, but we can test at least that it does not break the cluster :)
69664c2
to
dcc4781
Compare
I addressed all comments apart from a few that have to do with commits, will do those once I have e2e added to finalise the PR. |
598b2a4
to
e66add5
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@meyskens @invidian nice work, thanks!
Sorry, I didn't manage to review it all, it is quite a long PR. I haven't finished the first commit yet (although it seems the rest is simpler). For the future, I think the first commit can be split a lot and be way simpler to review.
Something that is not yet clear to me (maybe it is because I haven't finished reviewing) but won't components using service accounts need to be restarted too? What happens to the running compoenents if we rotate the cluster CA? Do they fail immediately? Also, not sure what happens with components using service accounts in other rotation cases (like not CAs, etc.).
Thanks again for the changes, will continue to review later. Leaving this rounds of comments now, though :)
assets/terraform-modules/bare-metal/flatcar-linux/kubernetes/ssh.tf
Outdated
Show resolved
Hide resolved
assets/terraform-modules/packet/flatcar-linux/kubernetes/ssh.tf
Outdated
Show resolved
Hide resolved
e66add5
to
ede9672
Compare
Since service account tokens are volatile I do not think this is an issue in most cases as they apps that use them should be designed to pick up changes. To be sure i did an experiment with cert-manager and rotated the CA (validity 1hour to 1000hours) then watched it's behavior. It had some watchers close due to the api server going down but as it is designed to do it picks them up again and after the 1 hour passed the controller still works. The Kubernetes controller runtime has picked up the new CA certificate. |
bfd8587
to
cca7dba
Compare
assets/terraform-modules/packet/flatcar-linux/kubernetes/outputs.tf
Outdated
Show resolved
Hide resolved
assets/terraform-modules/packet/flatcar-linux/kubernetes/outputs.tf
Outdated
Show resolved
Hide resolved
f293aed
to
33705f5
Compare
This allows running terraform apply with -parallelism=1.
33705f5
to
a19ca4f
Compare
So we can specify what Deployments and DaemonSets each platform uses.
This package adds functions to roll out Deployments and DaemonSets, with proper waiting. This code is mostly taken from kubectl code so we mimic its behavior.
When any of certificates used by etcd is changed, trigger copy-controller-secrets null_resource to copy etcd certificates on all controller nodes and then restart etcd to ensure it picks up new certificates, as it can only automatically pick up new client certificates and private key, but not the CA certificate.
This exposes the certs_validity_period_hours option to the bare-metal module in lokocfg Signed-off-by: Maartje Eyskens <maartje@kinvolk.io>
a19ca4f
to
053ca05
Compare
This CI fail on Packet seems concerning:
|
It seems we do need the |
assets/terraform-modules/packet/flatcar-linux/kubernetes/outputs.tf
Outdated
Show resolved
Hide resolved
assets/terraform-modules/packet/flatcar-linux/kubernetes/outputs.tf
Outdated
Show resolved
Hide resolved
They were not set as output in Terraform for Packet this caused the control plane check of lokoctl and certificate rotation to error. Co-authored-by: Iago López Galeiras <iago@kinvolk.io> Signed-off-by: Maartje Eyskens <maartje@kinvolk.io>
So it can be reused when rotating certificates.
This commit adds automated certificate rotation using the cluster certificate rotate command. 'lokoctl cluster certificate rotate' taints all certificate resources and updates terraform to re-issue them, it waits for all service account token secrets to be updated with the new CA certificate, then triggers a restart on all system deployments to make sure they pick up new CA certificate. This is the core of the certificate rotation logic. Co-authored-by: Maartje Eyskens <maartje@kinvolk.io> Signed-off-by: Mateusz Gozdek <mateusz@kinvolk.io> Signed-off-by: Maartje Eyskens <maartje@kinvolk.io>
This adds an e2e test that rotates the certificates then checks: * if all kubelets are up * the kubeconfig got changed * the API serves a longer valid cert Signed-off-by: Maartje Eyskens <maartje@kinvolk.io>
053ca05
to
228f504
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure if doing more reviews make sense at this point. If CI passes, I guess it's fine to merge.
Add a certificate rotation command
This builds on top of #1201.
It checks off the todo items and moves it into a separate command as well as add support for certificate rotation of etcd on all platforms.
Currently it will not rotate the private keys as that will cause some issues with the trust chain. It also doesn't properly set CA lengths which will allow for better rotation of the leaf certificates.
TODO
How to use
certs_validity_period_hours = 1
certs_validity_period_hours = <something bigger>
lokoctl cluster certificate rotate
Testing done