@Sithembiso-Mashinini Sithembiso-Mashinini commented Oct 22, 2025

Issue #, if available:

Description of changes:

Fixes node tainting failures when UseProviderId is enabled and Kubernetes node names differ from AWS PrivateDnsNames (e.g., clusters using instance IDs as node names).

The PreDrainTask functions in the SQS event handlers used interruptionEvent.NodeName (the AWS PrivateDnsName) directly, so tainting failed whenever the Kubernetes node name did not match it. Drain and cordon operations still worked because they resolved the correct node name through a provider ID lookup.
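
The idea can be illustrated with a minimal, self-contained sketch. This is not the code from this PR: the `Node` and `InterruptionEvent` types and the `resolveNodeName` helper are hypothetical stand-ins, and falling back to the event's node name when no provider ID matches is an assumption made for the example.

```go
package main

import "fmt"

// Node is a hypothetical, trimmed-down stand-in for a Kubernetes node.
type Node struct {
	Name       string // Kubernetes node name (metadata.name)
	ProviderID string // e.g. aws:///us-east-1d/i-0457d1e6522ad899c
}

// InterruptionEvent is a hypothetical stand-in for the handler's event type.
type InterruptionEvent struct {
	NodeName   string // AWS PrivateDnsName reported with the event
	ProviderID string // provider ID derived from the instance
}

// resolveNodeName returns the node name that tainting should target.
// With useProviderID set, it looks for a node whose provider ID matches
// the event. If none matches (or the flag is off), it falls back to the
// event's NodeName; that fallback is an assumption of this sketch.
func resolveNodeName(nodes []Node, ev InterruptionEvent, useProviderID bool) string {
	if useProviderID && ev.ProviderID != "" {
		for _, n := range nodes {
			if n.ProviderID == ev.ProviderID {
				return n.Name
			}
		}
	}
	return ev.NodeName
}

func main() {
	// Values taken from the example failure log in this description.
	nodes := []Node{
		{Name: "i-0457d1e6522ad899c", ProviderID: "aws:///us-east-1d/i-0457d1e6522ad899c"},
	}
	ev := InterruptionEvent{
		NodeName:   "ip-10-2-143-153.ec2.internal",
		ProviderID: "aws:///us-east-1d/i-0457d1e6522ad899c",
	}

	// Without the lookup, tainting targets a name that no node has.
	fmt.Println(resolveNodeName(nodes, ev, false)) // ip-10-2-143-153.ec2.internal

	// With the lookup, tainting targets the same node that drain/cordon use.
	fmt.Println(resolveNodeName(nodes, ev, true)) // i-0457d1e6522ad899c
}
```

Resolving the name once and passing the same value to taint, cordon, and drain keeps all three operations pointed at the same node.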

Example Failure

  {"level":"info","node-name":"ip-10-2-143-153.ec2.internal","instance-id":"i-0457d1e6522ad899c","provider-id":"aws:///us-east-1d/i-0457d1e6522ad899c","message":"Requesting instance drain"}
  {"level":"debug","target_provider_id":"aws:///us-east-1d/i-0457d1e6522ad899c","message":"Looking up node by ProviderID"}
  {"level":"debug","node_name":"i-0457d1e6522ad899c","node_provider_id":"aws:///us-east-1d/i-0457d1e6522ad899c","match":true,"message":"Checking node"}
  {"level":"debug","found_node":"i-0457d1e6522ad899c","message":"Returning node name"}
  {"level":"warn","node_name":"ip-10-2-143-153.ec2.internal","label_selector":"kubernetes.io/hostname in (ip-10-2-143-153,ip-10-2-143-153.ec2.internal)","matching_nodes":0,"message":"No nodes found with label selector"}
  {"level":"error","error":"nodes ip-10-2-143-153.ec2.internal not found","node_name":"ip-10-2-143-153.ec2.internal","message":"Failed to get node directly"}
  {"level":"error","error":"Unable to fetch kubernetes node from API: nodes ip-10-2-143-153.ec2.internal not found","message":"unable to taint node"}
  {"level":"info","node_name":"i-0457d1e6522ad899c","message":"Draining the node"}  // Works fine
  {"level":"info","node_name":"i-0457d1e6522ad899c","message":"Node successfully cordoned and drained"}  // Works fine

How you tested your changes:
Environment: Linux
Kubernetes Version: v1.31.5

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

@Sithembiso-Mashinini Sithembiso-Mashinini requested a review from a team as a code owner October 22, 2025 12:07
@Sithembiso-Mashinini
Author

@tiationg-kho nudge on this 🙏
