Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

✨ Add AWSMachines to back the ec2 instances in AWSMachinePools #4527

Closed
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Original file line number Diff line number Diff line change
Expand Up @@ -1103,6 +1103,10 @@ spec:
can be added as events to the Machine object and/or logged in the
controller's output.
type: string
infrastructureMachineKind:
description: InfrastructureMachineKind is the kind of the infrastructure
resources behind MachinePool Machines.
type: string
instances:
description: Instances contains the status for each instance in the
pool
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -1043,6 +1043,10 @@ spec:
can be added as events to the MachinePool object and/or logged in the
controller's output.
type: string
infrastructureMachineKind:
description: InfrastructureMachineKind is the kind of the infrastructure
resources behind MachinePool Machines.
type: string
launchTemplateID:
description: The ID of the launch template
type: string
Expand Down
9 changes: 9 additions & 0 deletions config/rbac/role.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -126,6 +126,14 @@ rules:
- cluster.x-k8s.io
resources:
- machines
verbs:
- delete
- get
- list
- watch
- apiGroups:
- cluster.x-k8s.io
resources:
- machines/status
verbs:
- get
Expand Down Expand Up @@ -310,6 +318,7 @@ rules:
resources:
- awsmachines
verbs:
- create
- delete
- get
- list
Expand Down
24 changes: 17 additions & 7 deletions controllers/awsmachine_controller.go
Original file line number Diff line number Diff line change
Expand Up @@ -143,10 +143,11 @@ func (r *AWSMachineReconciler) getObjectStoreService(scope scope.S3Scope) servic
return s3.NewService(scope)
}

// +kubebuilder:rbac:groups=infrastructure.cluster.x-k8s.io,resources=awsmachines,verbs=get;list;watch;update;patch;delete
// +kubebuilder:rbac:groups=infrastructure.cluster.x-k8s.io,resources=awsmachines/status,verbs=get;update;patch
// +kubebuilder:rbac:groups=controlplane.cluster.x-k8s.io,resources=*,verbs=get;list;watch
// +kubebuilder:rbac:groups=cluster.x-k8s.io,resources=machines;machines/status,verbs=get;list;watch
// +kubebuilder:rbac:groups=infrastructure.cluster.x-k8s.io,resources=awsmachines,verbs=create;get;list;watch;update;patch;delete
// +kubebuilder:rbac:groups=infrastructure.cluster.x-k8s.io,resources=awsmachines/status,verbs=get;update;patch
// +kubebuilder:rbac:groups=cluster.x-k8s.io,resources=machines,verbs=get;list;watch;delete
// +kubebuilder:rbac:groups=cluster.x-k8s.io,resources=machines/status,verbs=get;list;watch
// +kubebuilder:rbac:groups="",resources=secrets;,verbs=get;list;watch
// +kubebuilder:rbac:groups="",resources=namespaces,verbs=get;list;watch
// +kubebuilder:rbac:groups="",resources=events,verbs=get;list;watch;create;update;patch
Expand Down Expand Up @@ -459,6 +460,7 @@ func (r *AWSMachineReconciler) findInstance(machineScope *scope.MachineScope, ec
return instance, nil
}

//nolint:gocyclo
func (r *AWSMachineReconciler) reconcileNormal(_ context.Context, machineScope *scope.MachineScope, clusterScope cloud.ClusterScoper, ec2Scope scope.EC2Scope, elbScope scope.ELBScope, objectStoreScope scope.S3Scope) (ctrl.Result, error) {
machineScope.Trace("Reconciling AWSMachine")

Expand All @@ -482,7 +484,7 @@ func (r *AWSMachineReconciler) reconcileNormal(_ context.Context, machineScope *
}

// Make sure bootstrap data is available and populated.
if machineScope.Machine.Spec.Bootstrap.DataSecretName == nil {
if !machineScope.IsMachinePoolMachine() && machineScope.Machine.Spec.Bootstrap.DataSecretName == nil {
machineScope.Info("Bootstrap data secret reference is not yet available")
conditions.MarkFalse(machineScope.AWSMachine, infrav1.InstanceReadyCondition, infrav1.WaitingForBootstrapDataReason, clusterv1.ConditionSeverityInfo, "")
return ctrl.Result{}, nil
Expand All @@ -497,6 +499,12 @@ func (r *AWSMachineReconciler) reconcileNormal(_ context.Context, machineScope *
conditions.MarkUnknown(machineScope.AWSMachine, infrav1.InstanceReadyCondition, infrav1.InstanceNotFoundReason, err.Error())
return ctrl.Result{}, err
}
if instance == nil && machineScope.IsMachinePoolMachine() {
err = errors.New("no instance found for machine pool")
machineScope.Error(err, "unable to find instance")
conditions.MarkUnknown(machineScope.AWSMachine, infrav1.InstanceReadyCondition, infrav1.InstanceNotFoundReason, err.Error())
Comment on lines +503 to +505
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When you tested the new feature, did this code block spam the logs with lots of errors? I can imagine that a missing EC2 instance can happen a lot if the ASG terminates instances. Or does this not happen because CAPA watches for deletion events?

Wondering because findInstance has no special logic for a "not found" error which would allow us to swallow those.

And shouldn't we delete the Machine (and thereby InfraMachine) in this case (EC2 instance went away, so it shouldn't be represented as object anymore)? Otherwise the object may dangle indefinitely.

(TODO / note to self: the AWSMachinePool reconciler may do that, check in my code review)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do not recall being spammed with errors, but I last tested this cod in a cluster 6 months ago.

return ctrl.Result{}, err
}

// If the AWSMachine doesn't have our finalizer, add it.
if controllerutil.AddFinalizer(machineScope.AWSMachine, infrav1.MachineFinalizer) {
Expand Down Expand Up @@ -596,9 +604,11 @@ func (r *AWSMachineReconciler) reconcileNormal(_ context.Context, machineScope *
}

// reconcile the deletion of the bootstrap data secret now that we have updated instance state
if deleteSecretErr := r.deleteBootstrapData(machineScope, clusterScope, objectStoreScope); deleteSecretErr != nil {
r.Log.Error(deleteSecretErr, "unable to delete secrets")
return ctrl.Result{}, deleteSecretErr
if !machineScope.IsMachinePoolMachine() {
if deleteSecretErr := r.deleteBootstrapData(machineScope, clusterScope, objectStoreScope); deleteSecretErr != nil {
r.Log.Error(deleteSecretErr, "unable to delete secrets")
return ctrl.Result{}, deleteSecretErr
}
}

if instance.State == infrav1.InstanceStateTerminated {
Expand Down
12 changes: 11 additions & 1 deletion exp/api/v1beta1/conversion.go
Original file line number Diff line number Diff line change
Expand Up @@ -50,6 +50,7 @@ func (src *AWSMachinePool) ConvertTo(dstRaw conversion.Hub) error {
if restored.Spec.AvailabilityZoneSubnetType != nil {
dst.Spec.AvailabilityZoneSubnetType = restored.Spec.AvailabilityZoneSubnetType
}
dst.Status.InfrastructureMachineKind = restored.Status.InfrastructureMachineKind

if restored.Spec.AWSLaunchTemplate.PrivateDNSName != nil {
dst.Spec.AWSLaunchTemplate.PrivateDNSName = restored.Spec.AWSLaunchTemplate.PrivateDNSName
Expand Down Expand Up @@ -80,7 +81,6 @@ func (src *AWSMachinePoolList) ConvertTo(dstRaw conversion.Hub) error {
// ConvertFrom converts the v1beta2 AWSMachinePoolList receiver to v1beta1 AWSMachinePoolList.
func (r *AWSMachinePoolList) ConvertFrom(srcRaw conversion.Hub) error {
src := srcRaw.(*infrav1exp.AWSMachinePoolList)

return Convert_v1beta2_AWSMachinePoolList_To_v1beta1_AWSMachinePoolList(src, r, nil)
}

Expand Down Expand Up @@ -110,6 +110,8 @@ func (src *AWSManagedMachinePool) ConvertTo(dstRaw conversion.Hub) error {
dst.Spec.AvailabilityZoneSubnetType = restored.Spec.AvailabilityZoneSubnetType
}

dst.Status.InfrastructureMachineKind = restored.Status.InfrastructureMachineKind

return nil
}

Expand All @@ -129,6 +131,14 @@ func Convert_v1beta2_AWSManagedMachinePoolSpec_To_v1beta1_AWSManagedMachinePoolS
return autoConvert_v1beta2_AWSManagedMachinePoolSpec_To_v1beta1_AWSManagedMachinePoolSpec(in, out, s)
}

func Convert_v1beta2_AWSMachinePoolStatus_To_v1beta1_AWSMachinePoolStatus(in *infrav1exp.AWSMachinePoolStatus, out *AWSMachinePoolStatus, s apiconversion.Scope) error {
return autoConvert_v1beta2_AWSMachinePoolStatus_To_v1beta1_AWSMachinePoolStatus(in, out, s)
}

func Convert_v1beta2_AWSManagedMachinePoolStatus_To_v1beta1_AWSManagedMachinePoolStatus(in *infrav1exp.AWSManagedMachinePoolStatus, out *AWSManagedMachinePoolStatus, s apiconversion.Scope) error {
return autoConvert_v1beta2_AWSManagedMachinePoolStatus_To_v1beta1_AWSManagedMachinePoolStatus(in, out, s)
}

// ConvertTo converts the v1beta1 AWSManagedMachinePoolList receiver to a v1beta2 AWSManagedMachinePoolList.
func (src *AWSManagedMachinePoolList) ConvertTo(dstRaw conversion.Hub) error {
dst := dstRaw.(*infrav1exp.AWSManagedMachinePoolList)
Expand Down
32 changes: 12 additions & 20 deletions exp/api/v1beta1/zz_generated.conversion.go

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

4 changes: 4 additions & 0 deletions exp/api/v1beta2/awsmachinepool_types.go
Original file line number Diff line number Diff line change
Expand Up @@ -199,6 +199,10 @@ type AWSMachinePoolStatus struct {
// +optional
LaunchTemplateVersion *string `json:"launchTemplateVersion,omitempty"`

// InfrastructureMachineKind is the kind of the infrastructure resources behind MachinePool Machines.
// +optional
InfrastructureMachineKind string `json:"infrastructureMachineKind,omitempty"`

// FailureReason will be set in the event that there is a terminal problem
// reconciling the Machine and will contain a succinct value suitable
// for machine interpretation.
Expand Down
4 changes: 4 additions & 0 deletions exp/api/v1beta2/awsmanagedmachinepool_types.go
Original file line number Diff line number Diff line change
Expand Up @@ -199,6 +199,10 @@ type AWSManagedMachinePoolStatus struct {
// +optional
LaunchTemplateVersion *string `json:"launchTemplateVersion,omitempty"`

// InfrastructureMachineKind is the kind of the infrastructure resources behind MachinePool Machines.
// +optional
InfrastructureMachineKind string `json:"infrastructureMachineKind,omitempty"`

// FailureReason will be set in the event that there is a terminal problem
// reconciling the MachinePool and will contain a succinct value suitable
// for machine interpretation.
Expand Down
5 changes: 5 additions & 0 deletions exp/api/v1beta2/conditions_consts.go
Original file line number Diff line number Diff line change
Expand Up @@ -54,6 +54,11 @@ const (
InstanceRefreshNotReadyReason = "InstanceRefreshNotReady"
// InstanceRefreshFailedReason used to report when there instance refresh is not initiated.
InstanceRefreshFailedReason = "InstanceRefreshFailed"

// AWSMachineCreationFailed reports if creating AWSMachines to represent ASG (machine pool) machines failed.
AWSMachineCreationFailed = "AWSMachineCreationFailed"
// AWSMachineDeletionFailed reports if deleting AWSMachines failed.
AWSMachineDeletionFailed = "AWSMachineDeletionFailed"
)

const (
Expand Down
5 changes: 5 additions & 0 deletions exp/api/v1beta2/types.go
Original file line number Diff line number Diff line change
Expand Up @@ -22,6 +22,11 @@ import (
infrav1 "sigs.k8s.io/cluster-api-provider-aws/v2/api/v1beta2"
)

const (
// KindMachinePool is a MachinePool resource Kind
KindMachinePool string = "MachinePool"
)

// EBS can be used to automatically set up EBS volumes when an instance is launched.
type EBS struct {
// Encrypted is whether the volume should be encrypted or not.
Expand Down
Loading