tthvo
Contributor

@tthvo tthvo commented Jul 30, 2025

What type of PR is this?
/kind feature

What this PR does / why we need it:

CAPA currently supports IPv6 on EKS but not on self-managed clusters. These changes add IPv6 support for self-managed clusters, covering both single-stack IPv6 and dualstack.

Which issue(s) this PR fixes:

Fixes #2420
Fixes #3381

Special notes for your reviewer:

  • EC2 instance type: Only Nitro-based instance types support IPv6.
  • CNI: The CNI plugin needs to support IPv6. I have included sample manifests in test/e2e/data/cni. Calico does not support IPv6 with "IP-in-IP", so we need to use VXLAN.

Checklist:

  • squashed commits
  • includes documentation
  • includes emoji in title
  • adds unit tests
  • adds or updates e2e tests

Release note:

Enable IPv6 support for self-managed Kubernetes clusters

@k8s-ci-robot k8s-ci-robot added release-note Denotes a PR that will be considered when it comes time to generate release notes. kind/feature Categorizes issue or PR as related to a new feature. labels Jul 30, 2025

linux-foundation-easycla bot commented Jul 30, 2025

CLA Signed

The committers listed above are authorized under a signed CLA.

@k8s-ci-robot k8s-ci-robot requested review from AndiDog and damdo July 30, 2025 21:03
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign nrb for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added needs-priority cncf-cla: no Indicates the PR's author has not signed the CNCF CLA. labels Jul 30, 2025
@k8s-ci-robot
Contributor

Welcome @tthvo!

It looks like this is your first PR to kubernetes-sigs/cluster-api-provider-aws 🎉. Please refer to our pull request process documentation to help your PR have a smooth ride to approval.

You will be prompted by a bot to use commands during the review process. Do not be afraid to follow the prompts! It is okay to experiment. Here is the bot commands documentation.

You can also check if kubernetes-sigs/cluster-api-provider-aws has its own contribution guidelines.

You may want to refer to our testing guide if you run into trouble with your tests not passing.

If you are having difficulty getting your pull request seen, please follow the recommended escalation practices. Also, for tips and tricks in the contribution process you may want to read the Kubernetes contributor cheat sheet. We want to make sure your contribution gets all the attention it needs!

Thank you, and welcome to Kubernetes. 😃

@k8s-ci-robot k8s-ci-robot added the needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. label Jul 30, 2025
@k8s-ci-robot
Contributor

Hi @tthvo. Thanks for your PR.

I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@k8s-ci-robot k8s-ci-robot added size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. and removed cncf-cla: no Indicates the PR's author has not signed the CNCF CLA. labels Jul 30, 2025
@tthvo
Contributor Author

tthvo commented Jul 30, 2025

/cc @nrb @sadasu @patrickdillon

I am not yet sure what to do about e2e tests, or whether any exist for IPv6 clusters, so I am leaving that as a pending TODO.

@k8s-ci-robot k8s-ci-robot requested a review from nrb July 30, 2025 21:23
@k8s-ci-robot
Contributor

@tthvo: GitHub didn't allow me to request PR reviews from the following users: sadasu, patrickdillon.

Note that only kubernetes-sigs members and repo collaborators can review this PR, and authors cannot review their own PRs.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@tthvo
Contributor Author

tthvo commented Jul 30, 2025

A quick preview of kubectl get for an IPv6 self-managed cluster can be found here. So far, nodes, pods and services are assigned the expected IPv6 family IPs 😄

@tthvo tthvo force-pushed the singestack-ipv6 branch from ddacd99 to 13e4379 Compare July 31, 2025 00:37
@damdo
Member

damdo commented Aug 4, 2025

/ok-to-test

@k8s-ci-robot k8s-ci-robot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Aug 4, 2025
@damdo
Member

damdo commented Aug 4, 2025

/assign @mtulio

Asking you for a review Marco as I know you have been working on this downstream

@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Aug 4, 2025
@tthvo tthvo force-pushed the singestack-ipv6 branch from 13e4379 to 8816d87 Compare August 5, 2025 18:09
@k8s-ci-robot k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Aug 5, 2025
@tthvo tthvo force-pushed the singestack-ipv6 branch from 8816d87 to 5dd2888 Compare August 5, 2025 18:11
@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Oct 7, 2025
@k8s-ci-robot k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Oct 7, 2025
@tthvo tthvo force-pushed the singestack-ipv6 branch 2 times, most recently from c70f341 to d7df0b9 Compare October 10, 2025 21:52
tthvo added 23 commits October 10, 2025 15:25
AWS requires that when registering targets by instance ID for an IPv6
target group, the targets must have an assigned primary IPv6 address.

Note: The default subnets managed by CAPA are already set up to assign
IPv6 addresses to newly created ENIs.
The httpProtocolIPv6 field enables or disables the IPv6 endpoint of the
instance metadata service. The SDK only applies this field if
httpEndpoint is enabled.

On single-stack IPv6, pods have only IPv6 addresses, so they need the
IPv6 endpoint to query IMDS because the IPv4 network is unreachable.
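
To illustrate why the IPv6 endpoint matters, here is a minimal sketch of choosing an IMDS endpoint from a caller's address family. The two endpoint addresses are the documented IMDS addresses; the function name `imdsEndpointFor` is hypothetical and is not part of CAPA or the AWS SDK.

```go
package main

import (
	"fmt"
	"net/netip"
)

const (
	// Documented IMDS endpoints for IPv4 and IPv6 clients.
	imdsIPv4Endpoint = "http://169.254.169.254"
	imdsIPv6Endpoint = "http://[fd00:ec2::254]"
)

// imdsEndpointFor (hypothetical helper) returns the IMDS endpoint that a
// client with the given local IP can reach.
func imdsEndpointFor(localIP string) (string, error) {
	addr, err := netip.ParseAddr(localIP)
	if err != nil {
		return "", fmt.Errorf("parsing %q: %w", localIP, err)
	}
	if addr.Is4() || addr.Is4In6() {
		return imdsIPv4Endpoint, nil
	}
	return imdsIPv6Endpoint, nil
}

func main() {
	for _, ip := range []string{"10.0.1.15", "2001:db8::15"} {
		endpoint, err := imdsEndpointFor(ip)
		if err != nil {
			panic(err)
		}
		fmt.Printf("%s -> %s\n", ip, endpoint)
	}
}
```

The IPv6 endpoint only answers when httpProtocolIPv6 is enabled on the instance, which is what this commit configures.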
If the egress-only internet gateway (eigw) is deleted, the CAPA
reconciliation loop creates a new one. CAPA therefore needs to update the
routes to point to the new eigw ID.
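
The decision described above can be sketched roughly as follows. The `route` type, the `routeAction` values, and `defaultIPv6RouteAction` are hypothetical simplifications for illustration and do not mirror CAPA's actual route reconciliation code.

```go
package main

import "fmt"

// route is a hypothetical, simplified view of a route table entry.
type route struct {
	DestinationIPv6CIDR string
	EgressOnlyGatewayID string
}

// routeAction names what the reconciler should do with the "::/0" route.
type routeAction string

const (
	actionNone    routeAction = "none"
	actionCreate  routeAction = "create"
	actionReplace routeAction = "replace"
)

// defaultIPv6RouteAction compares the existing "::/0" route (if any) with
// the ID of the eigw that currently exists.
func defaultIPv6RouteAction(existing []route, currentEIGWID string) routeAction {
	for _, r := range existing {
		if r.DestinationIPv6CIDR != "::/0" {
			continue
		}
		if r.EgressOnlyGatewayID == currentEIGWID {
			return actionNone
		}
		// The route still targets a gateway that was deleted and recreated.
		return actionReplace
	}
	return actionCreate
}

func main() {
	stale := []route{{DestinationIPv6CIDR: "::/0", EgressOnlyGatewayID: "eigw-old"}}
	fmt.Println(defaultIPv6RouteAction(stale, "eigw-new"))
}
```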
This allows IPv6-only workloads to reach IPv4-only services. AWS
supports this via NAT64/DNS64.

More details: https://docs.aws.amazon.com/vpc/latest/userguide/nat-gateway-nat64-dns64.html
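
For readers unfamiliar with DNS64, the sketch below shows how a synthesized AAAA address is formed: the IPv4 address is embedded in the last 32 bits of the well-known prefix 64:ff9b::/96 (RFC 6052), which is the prefix AWS documents for NAT64. The helper name `synthesizeNAT64` is hypothetical.

```go
package main

import (
	"fmt"
	"net/netip"
)

// nat64Prefix is the well-known NAT64 prefix defined in RFC 6052.
const nat64Prefix = "64:ff9b::/96"

// synthesizeNAT64 (hypothetical helper) embeds an IPv4 address in the
// NAT64 prefix, as a DNS64 resolver does for IPv4-only names.
func synthesizeNAT64(ipv4 string) (string, error) {
	v4, err := netip.ParseAddr(ipv4)
	if err != nil {
		return "", fmt.Errorf("parsing %q: %w", ipv4, err)
	}
	if !v4.Is4() {
		return "", fmt.Errorf("%q is not an IPv4 address", ipv4)
	}
	out := netip.MustParsePrefix(nat64Prefix).Addr().As16()
	v4Bytes := v4.As4()
	copy(out[12:], v4Bytes[:]) // the last 4 bytes carry the IPv4 address
	return netip.AddrFrom16(out).String(), nil
}

func main() {
	addr, err := synthesizeNAT64("192.0.2.1")
	if err != nil {
		panic(err)
	}
	fmt.Println(addr)
}
```

Traffic to such an address only leaves the subnet if its route table sends 64:ff9b::/96 to a NAT gateway.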
CAPA handles icmpv6 as protocol number 58. AWS accepts the protocol
number when creating rules. However, describing a rule via the AWS API
returns the protocol name, so CAPA does not recognize it and fails.
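
One way to avoid this kind of mismatch is to normalize both sides before comparing. The sketch below is an assumption about the general approach, not the code in this PR; `protocolAliases`, `normalizeProtocol`, and `sameProtocol` are hypothetical names, and the numbers are the IANA-assigned protocol numbers.

```go
package main

import (
	"fmt"
	"strings"
)

// protocolAliases maps protocol names to their IANA protocol numbers.
var protocolAliases = map[string]string{
	"icmp":   "1",
	"tcp":    "6",
	"udp":    "17",
	"icmpv6": "58",
}

// normalizeProtocol converts a known protocol name to its number and
// leaves anything else (including "-1" for all protocols) unchanged.
func normalizeProtocol(p string) string {
	p = strings.ToLower(strings.TrimSpace(p))
	if number, ok := protocolAliases[p]; ok {
		return number
	}
	return p
}

// sameProtocol reports whether two protocol strings refer to the same
// protocol regardless of name or number form.
func sameProtocol(a, b string) bool {
	return normalizeProtocol(a) == normalizeProtocol(b)
}

func main() {
	fmt.Println(sameProtocol("58", "icmpv6"))
}
```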
…ices

For IPv4, the field NodePortIngressRuleCidrBlocks specifies the allowed
source IPv4 CIDRs for NodePort services on ports 30000-32767.

This extends that field to also accept IPv6 source CIDRs.
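
Because EC2 security group rules carry IPv4 and IPv6 source ranges in separate fields, a single mixed list has to be split by address family before building the rule. A minimal sketch of that split, with the hypothetical helper `splitCIDRsByFamily`:

```go
package main

import (
	"fmt"
	"net/netip"
)

// splitCIDRsByFamily (hypothetical helper) separates a mixed list of CIDR
// blocks into IPv4 and IPv6 lists, rejecting entries that do not parse.
func splitCIDRsByFamily(cidrs []string) (v4, v6 []string, err error) {
	for _, c := range cidrs {
		prefix, perr := netip.ParsePrefix(c)
		if perr != nil {
			return nil, nil, fmt.Errorf("invalid CIDR block %q: %w", c, perr)
		}
		if prefix.Addr().Is4() {
			v4 = append(v4, c)
		} else {
			v6 = append(v6, c)
		}
	}
	return v4, v6, nil
}

func main() {
	v4, v6, err := splitCIDRsByFamily([]string{"10.0.0.0/16", "2001:db8::/32", "0.0.0.0/0"})
	if err != nil {
		panic(err)
	}
	fmt.Println("IPv4:", v4)
	fmt.Println("IPv6:", v6)
}
```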
We need an option to configure IPv6 source CIDRs for the SSH ingress
rule of the bastion host.

This extends the field allowedCIDRBlocks to also accept IPv6 CIDR blocks.
When creating a bastion host for an IPv6 cluster, the instance has both
a public IPv4 and an IPv6 address. Thus, we need to report whichever of
them are present in the cluster status.

This also adds a print column to display the bastion IPv6 address.
This is a minimal template set to install an IPv6-enabled cluster. Both
the control plane and worker nodes must use Nitro-based instance types
(with IPv6 support).
This combines the existing docs for IPv6 EKS clusters with those for
non-EKS ones, and registers the topic page in the documentation TOC.
Validation of the specified VPC and subnet CIDRs is added to the webhook
for early feedback.

Checks for bastion and NodePort CIDRs already exist.
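
As an illustration of the kind of early check a webhook can make, the sketch below verifies that a subnet CIDR parses, matches the VPC CIDR's address family, and lies inside the VPC CIDR. This is an assumed example of such validation, not the rules implemented in this PR; `validateSubnetInVPC` is a hypothetical name.

```go
package main

import (
	"fmt"
	"net/netip"
)

// validateSubnetInVPC (hypothetical helper) checks that subnetCIDR is a
// valid prefix of the same IP family as vpcCIDR and is contained in it.
func validateSubnetInVPC(vpcCIDR, subnetCIDR string) error {
	vpc, err := netip.ParsePrefix(vpcCIDR)
	if err != nil {
		return fmt.Errorf("invalid VPC CIDR %q: %w", vpcCIDR, err)
	}
	subnet, err := netip.ParsePrefix(subnetCIDR)
	if err != nil {
		return fmt.Errorf("invalid subnet CIDR %q: %w", subnetCIDR, err)
	}
	if vpc.Addr().Is4() != subnet.Addr().Is4() {
		return fmt.Errorf("subnet %s and VPC %s use different IP families", subnet, vpc)
	}
	// The subnet must be at least as specific as the VPC and start inside it.
	if subnet.Bits() < vpc.Bits() || !vpc.Masked().Contains(subnet.Masked().Addr()) {
		return fmt.Errorf("subnet %s is not within VPC %s", subnet, vpc)
	}
	return nil
}

func main() {
	fmt.Println(validateSubnetInVPC("2001:db8:1234:1a00::/56", "2001:db8:1234:1a01::/64"))
	fmt.Println(validateSubnetInVPC("2001:db8:1234:1a00::/56", "2001:db8:1234:1b00::/64"))
}
```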
The following is added:
- [BYO VPC] Mention the required route when enabling DNS64.
- [BYO VPC] Mention that CAPA only utilizes the IPv6 aspect of the dual
  stack VPC.
There is a brief period during which the IPv6 CIDR is not yet associated
with the subnets. Thus, when CAPA creates the default dualstack subnets,
it should wait until the IPv6 CIDR is associated before proceeding.

Otherwise, CAPA will misinterpret the subnet as non-IPv6 and continue its
reconciliation. As a consequence, CAPA will skip creating the route to
the eigw. A route to the eigw for destination "::/0" is required for EC2
instance time sync on start-up.
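
The wait described above can be sketched as a simple polling loop. This is an assumption about the general shape, not the implementation in this PR: `describeFunc`, `waitForIPv6CIDR`, and `simulateWait` are hypothetical, and a real controller would back the callback with an EC2 DescribeSubnets call.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// describeFunc is a hypothetical callback that reports whether the subnet
// currently has an associated IPv6 CIDR.
type describeFunc func(ctx context.Context, subnetID string) (bool, error)

// waitForIPv6CIDR polls describe until the subnet reports an associated
// IPv6 CIDR, the callback fails, or ctx is done.
func waitForIPv6CIDR(ctx context.Context, subnetID string, interval time.Duration, describe describeFunc) error {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	for {
		associated, err := describe(ctx, subnetID)
		if err != nil {
			return fmt.Errorf("describing subnet %s: %w", subnetID, err)
		}
		if associated {
			return nil
		}
		select {
		case <-ctx.Done():
			return fmt.Errorf("subnet %s has no IPv6 CIDR yet: %w", subnetID, ctx.Err())
		case <-ticker.C:
		}
	}
}

// simulateWait runs the loop against a fake that becomes ready on the
// readyAfter-th call and returns how many calls were made.
func simulateWait(readyAfter int) (int, error) {
	calls := 0
	fake := func(context.Context, string) (bool, error) {
		calls++
		return calls >= readyAfter, nil
	}
	ctx, cancel := context.WithTimeout(context.Background(), time.Second)
	defer cancel()
	err := waitForIPv6CIDR(ctx, "subnet-0123456789abcdef0", time.Millisecond, fake)
	return calls, err
}

func main() {
	calls, err := simulateWait(3)
	fmt.Println(calls, err)
}
```

Bounding the wait with a context deadline keeps a stuck association from blocking reconciliation indefinitely.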
…ined

When AWSCluster.spec.network.vpc.ipv6 is non-nil, most handlers in CAPA
treat it as "adding" IPv6 capabilities on top of the IPv4 infrastructure.
The exception is the security group ingress rules for the API LB.

This commit aligns the API LB SG handler with the rest of the code base.

These rules can be overridden in the AWSCluster LB spec to allow only
IPv6 CIDRs if needed.
The field isIpv6 is set to true if and only if the subnet has an
associated IPv6 CIDR. This means the VPC is also associated with an
IPv6 CIDR.
The field targetGroupIPType is added to the load balancer spec to allow
configuring the IP address type of the target group for API load balancers.

This field is not applicable to Classic Load Balancers (CLB).

This commit also defines a new network status field that records the IP
address type of API load balancers.
When creating subnets in a managed VPC with IPv6 enabled, CAPA now
automatically assigns IPv6 CIDR blocks to subnets that have isIPv6=true
but no explicit IPv6CidrBlock. Users can enable IPv6 without manually
calculating per-subnet IPv6 CIDRs, which helps when the VPC IPv6 CIDR is
unknown before provisioning because AWS assigns it during VPC creation.

Note: This logic only applies when spec.network.vpc.ipv6 is non-nil and
the subnets are managed and do not exist yet.
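
To make the idea of automatic assignment concrete, the sketch below carves the n-th /64 out of a VPC IPv6 CIDR such as a /56. It illustrates the arithmetic only and is not necessarily how CAPA picks subnet CIDRs; `nthSubnet64` is a hypothetical helper.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"net/netip"
)

// nthSubnet64 (hypothetical helper) returns the n-th /64 inside an IPv6
// VPC CIDR, counting from zero.
func nthSubnet64(vpcCIDR string, n int) (string, error) {
	vpc, err := netip.ParsePrefix(vpcCIDR)
	if err != nil {
		return "", fmt.Errorf("invalid VPC CIDR %q: %w", vpcCIDR, err)
	}
	if !vpc.Addr().Is6() || vpc.Addr().Is4In6() || vpc.Bits() > 64 {
		return "", fmt.Errorf("%s is not an IPv6 CIDR of /64 or larger", vpc)
	}
	subnetBits := 64 - vpc.Bits() // bits available for numbering /64 subnets
	if n < 0 || (subnetBits < 63 && n >= 1<<subnetBits) {
		return "", fmt.Errorf("subnet index %d does not fit in %s", n, vpc)
	}
	base := vpc.Masked().Addr().As16()
	// The upper 64 bits hold the network part; OR in the subnet index.
	upper := binary.BigEndian.Uint64(base[:8]) | uint64(n)
	binary.BigEndian.PutUint64(base[:8], upper)
	return netip.PrefixFrom(netip.AddrFrom16(base), 64).String(), nil
}

func main() {
	for i := 0; i < 3; i++ {
		cidr, err := nthSubnet64("2001:db8:1234:1a00::/56", i)
		if err != nil {
			panic(err)
		}
		fmt.Println(cidr)
	}
}
```

A /56 leaves 8 bits for the subnet index, so it yields 256 possible /64 subnets.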
The field awscluster.spec.network.vpc.ipv6.ipamPool defines the IPAM
pool from which to allocate an IPv6 CIDR for the VPC. Previously, CAPA
only considered the field awscluster.spec.network.vpc.ipamPool, which is
used only for VPC IPv4 CIDR allocation.

Additionally, CAPA should preserve the user-provided ipv6 spec fields,
for example, the ipv6 ipamPool. Previously, these spec fields were lost
during VPC reconciliation.
NAT64/DNS64 is meant to be enabled for IPv6-only subnets, in which
instances do not have IPv4 [0].

If we enable DNS64 for dualstack subnets, instances will receive both
A and AAAA records for IPv4-only services. In most cases, OS-level
settings prefer IPv6, causing traffic to flow via the NAT gateway instead
of using IPv4 directly.

Reference

[0] https://aws.amazon.com/blogs/networking-and-content-delivery/dual-stack-architectures-for-aws-and-hybrid-networks-part-2/
Add a new dualstack cluster template and documentation updates for IPv6
and dualstack cluster configurations. Docs for configuring the API LB's
target group IP type are also added.

New cluster templates and a Calico manifest are included for creating
dualstack clusters.
Labels

cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/feature Categorizes issue or PR as related to a new feature. needs-priority ok-to-test Indicates a non-member PR verified by an org member that is safe to test. release-note Denotes a PR that will be considered when it comes time to generate release notes. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Add dual stack support
Add IPv6 support

6 participants