@danishprakash
Contributor

Opening this pull request to continue the discussion from containers/netavark#1338

Currently, podman only binds the IPv4 wildcard address. But since golang's net.Listen("tcp",) listens on all IPv4 and IPv6 addresses, we (intentionally) handle both 4 and 6 here. The downside was that you could still explicitly request the IPv6 wildcard (::) along with the dual-stack wildcard (""), causing netavark to leak nftables rules. This change binds both IPv4 (0.0.0.0) and IPv6 (::) when no address is passed, and returns EADDRINUSE if an explicit IPv4 or IPv6 address is requested on top of the dual-stack wildcard.

cc/ @Luap99
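
To illustrate the rule described above, here is a minimal sketch of rejecting an explicit host IP that overlaps a dual-stack wildcard on the same protocol and port. It is not the code in this PR: the `portMapping` type and the `checkWildcardConflicts` helper are hypothetical names made up for the example, and the check is done up front rather than by the kernel at bind time.

```go
package main

import (
	"fmt"
	"syscall"
)

// portMapping is a hypothetical, simplified stand-in for a published port.
// An empty HostIP means "no address passed", i.e. the dual-stack wildcard.
type portMapping struct {
	Protocol string
	HostIP   string
	HostPort uint16
}

// checkWildcardConflicts returns an error wrapping EADDRINUSE when a mapping
// with an explicit host IP uses the same protocol and port as a mapping
// without a host IP.
func checkWildcardConflicts(mappings []portMapping) error {
	type key struct {
		proto string
		port  uint16
	}
	// First pass: remember every protocol/port claimed by the dual-stack wildcard.
	dualStack := make(map[key]bool)
	for _, m := range mappings {
		if m.HostIP == "" {
			dualStack[key{m.Protocol, m.HostPort}] = true
		}
	}
	// Second pass: any explicit address on such a protocol/port overlaps it.
	for _, m := range mappings {
		if m.HostIP != "" && dualStack[key{m.Protocol, m.HostPort}] {
			return fmt.Errorf("%s [%s]:%d overlaps the dual-stack wildcard: %w",
				m.Protocol, m.HostIP, m.HostPort, syscall.EADDRINUSE)
		}
	}
	return nil
}

func main() {
	// Mirrors -p 8080:8080 -p '[::]:8080:8080' from the discussion.
	err := checkWildcardConflicts([]portMapping{
		{Protocol: "tcp", HostIP: "", HostPort: 8080},
		{Protocol: "tcp", HostIP: "::", HostPort: 8080},
	})
	fmt.Println(err)
}
```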

f6, err := bindPort(protocol, "", port.HostPort+i, true, &sctpWarning)
if err != nil {
logrus.Warnf("Failed to bind IPv6 for port %d, continuing with IPv4 only: %v", port.HostPort+i, err)
Collaborator

Suggested change
logrus.Warnf("Failed to bind IPv6 for port %d, continuing with IPv4 only: %v", port.HostPort+i, err)
logrus.Warnf("failed to bind IPv6 for port %d, continuing with IPv4 only: %v", port.HostPort+i, err)

Member

No, logrus statements are generally expected to start with an upper-case letter, while actual errors should be lower case. There is no formal policy, though.

}

f6, err := bindPort(protocol, "", port.HostPort+i, true, &sctpWarning)
if err != nil {
Collaborator

Should this be an error rather than a warning? Some users might explicitly expect a bind on v6.

@mheon
Member

mheon commented Oct 17, 2025

I would lean towards retaining current behavior for binding ports and just changing the bind to the port to be IPv4 only - but, if we do want to do this, the Podman 6 merge window opens next week, giving us the opportunity to make breaking changes like this.

@Luap99
Member

Luap99 commented Oct 20, 2025

I would lean towards retaining current behavior for binding ports and just changing the bind to the port to be IPv4 only - but, if we do want to do this, the Podman 6 merge window opens next week, giving us the opportunity to make breaking changes like this.

I disagree, binding v4 only is a bug as we do not correctly prevent another v6 application on the host from taking the port.
I do agree though that this is a breaking change as it could break people having these duplicated port definitions so it goes into podman 6 only.

Member

@Luap99 Luap99 left a comment

I don't like to have two sockets for v6 and v4, that is just an extra cpu/memory hit. The way it should be:

  • no HostIP: bind dual stack, e.g. tcp proto
  • ipv4 HostIP (including 0.0.0.0): bind v4, e.g. tcp4
  • ipv6 HostIP (including ::): bind v6 only, e.g. tcp6
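
The three cases above map onto the network names accepted by Go's `net.Listen` and `net.ListenPacket` (`tcp`/`udp` for dual stack, a `4` suffix for IPv4 only, a `6` suffix for IPv6 only). A minimal sketch follows; `bindNetwork` is a hypothetical helper name used for illustration, not a function from the podman code base.

```go
package main

import (
	"fmt"
	"net"
)

// bindNetwork picks the Go network name for a base protocol ("tcp" or "udp")
// and a host IP. An empty host IP selects the dual-stack network.
func bindNetwork(proto, hostIP string) (string, error) {
	if hostIP == "" {
		return proto, nil // no HostIP: dual stack
	}
	ip := net.ParseIP(hostIP)
	if ip == nil {
		return "", fmt.Errorf("invalid host IP %q", hostIP)
	}
	// To4 is non-nil for IPv4 addresses (and IPv4-mapped IPv6 addresses).
	if ip.To4() != nil {
		return proto + "4", nil // includes 0.0.0.0
	}
	return proto + "6", nil // includes ::
}

func main() {
	for _, hostIP := range []string{"", "0.0.0.0", "127.0.0.1", "::", "fd00::1"} {
		network, err := bindNetwork("tcp", hostIP)
		fmt.Printf("%q -> %s (err: %v)\n", hostIP, network, err)
	}
}
```

With this mapping a single socket is opened per port in every case, which avoids the two-socket overhead mentioned above.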

There are several other problems with port bindings that I would love to fix for Podman 6, though not in this PR.

  1. There is the bug of forcing 0.0.0.0 in parseSplitPort() when passing ports on the cli, which is no longer right and should be removed. Additionally, there are many cases where 0.0.0.0 is added as hostip in inspect and other display commands such as podman port. That isn't really right and needs to be figured out.
  2. Port binding does not happen inside the network interface setup, where it ideally should. The problem is that this doesn't work due to PostConfigureNetNS and the complication that ports must be bound before we start conmon but the netns must be configured afterwards, so there isn't a way to achieve this right now.
  3. Using the golang std library API to bind the ports is problematic because it calls listen(), which we really should not do. That way, connections that do not get redirected via the firewall get buffered by the kernel and as such hang, e.g. we see that when the firewall rules get flushed. It would be best to just bind the ports.

@mheon
Member

mheon commented Oct 20, 2025

Second and third points sound like there would be benefits to offloading the port binding to Conmon - just give it a list of things to listen on, instead of passing an arbitrary number of FDs for ports

@Luap99
Member

Luap99 commented Oct 20, 2025

Second and third points sound like there would be benefits to offloading the port binding to Conmon - just give it a list of things to listen on, instead of passing an arbitrary number of FDs for ports

That doesn't work: the network setup without a userns (PostConfigureNetNS) happens before conmon is even started, but ports have to be bound before the network setup. That is what I meant: point 2 is basically unfixable because of these two different orderings, which I tried to remove in the past, but that doesn't work due to the somewhat broken checkpoint/restore support.

@baude baude added the 6.0 Breaking changes for Podman 6.0 label Oct 21, 2025
Currently podman only binds IPv4 wildcard address. But since
golang net.Listen("tcp",) listens on all IPv4 and IPv6 addresses,
we (intentionally) handle both 4 and 6 here. But the downside was that you
could still explicitly request an IPv6 (::) wildcard along with a
dual-stack("") wildcard causing netavark to leak nftable rules.
This change handles both IPv4(0.0.0.0) and IPv6(::) when no address is
passed, and throws EADDRINUSE if an explicit IPv4 or IPv6 is requested
on top of dual-stack wildcard.

Signed-off-by: Danish Prakash <contact@danishpraka.sh>
@danishprakash
Contributor Author

  • no HostIP: bind dual stack, e.g. tcp proto
  • ipv4 HostIP (including 0.0.0.0): bind v4, e.g. tcp4
  • ipv6 HostIP (including ::): bind v6 only, e.g. tcp6

By default, podman now binds dual-stack unless an IP is specified; in that case, podman parses the IP to determine the address family (as was the case before) and uses either [tcp|udp]4 or [tcp|udp]6. The problem with 0.0.0.0 remains; perhaps it can be discussed in another PR, given the impact on compat that would have.

However, netavark currently fails to set up nftables rules properly with rootless containers:

$ podman run -d --name netavark-test-container -it --network test_net -p 8080:8080 -p '[::]:8080:8080' alpine /bin/sh -c 'sleep 2'
internal:0:0-0: Error: Could not process rule: No such file or directory
internal:0:0-0: Error: Could not process rule: No such file or directory
internal:0:0-0: Error: Could not process rule: No such file or directory
internal:0:0-0: Error: Could not process rule: No such file or directory
internal:0:0-0: Error: Could not process rule: No such file or directory
internal:0:0-0: Error: Could not process rule: No such file or directory
WARN[0003] failed to teardown network after failed setup: netavark: nftables error: "nft" did not return successfully while applying ruleset
Error: rootlessport listen tcp 0.0.0.0:8080: bind: address already in use
