@tiagotxm
Contributor

@tiagotxm tiagotxm commented Nov 3, 2025

When the driver pod IP is an IPv6 address, Spark expects the address to be enclosed in square brackets (e.g., [2001:db8::1]). The previous approach set spark.driver.host unconditionally via:

args = append(args, "--conf", "spark.driver.host=${POD_IP}")

With an IPv6 pod IP, ${POD_IP} expands to a bare address without brackets, which fails Spark's strict hostname check.

Purpose of this PR

Proposed changes:

Fix spark.driver.host to Enclose IPv6 Addresses in Brackets to Prevent AssertionError (#2679)

Change Category

  • Bugfix (non-breaking change which fixes an issue)
  • Feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that could affect existing functionality)
  • Documentation update

Rationale

Checklist

  • I have conducted a self-review of my own code.
  • I have updated documentation accordingly.
  • I have added tests that prove my changes are effective or that my feature works.
  • Existing unit tests pass locally with my changes.

Additional Notes

  • Fixes the AssertionError on IPv6 clusters by ensuring Spark receives a valid host identifier.
  • Compatible with both IPv4 and IPv6 environments.

@tiagotxm
Contributor Author

tiagotxm commented Nov 3, 2025

Hi maintainers,

I'm working on a fix for #2679 to properly enclose IPv6 addresses in brackets for spark.driver.host.
I noticed that in pkg/util/util.go, the GetMasterURL() function already handles the IPv6 bracket requirement with logic like:

if strings.Contains(kubernetesServiceHost, ":") && !strings.HasPrefix(kubernetesServiceHost, "[") {
    return fmt.Sprintf("k8s://https://[%s]:%s", kubernetesServiceHost, kubernetesServicePort), nil
}

Would it make sense to generalize or reuse this condition for setting values such as spark.driver.host, to ensure consistent IPv6 handling across the operator codebase?

Any feedback or suggestions would be appreciated!

Thanks!

@tiagotxm tiagotxm changed the title from "Fix driver host configuration to handle IP addresses with brackets in…" to "Fix driver host configuration to handle IPv6 addresses" on Nov 3, 2025
@google-oss-prow
Contributor

@dbbvitor: changing LGTM is restricted to collaborators

@ChenYi015
Member

Would it make sense to generalize or reuse this condition for setting values such as spark.driver.host, to ensure consistent IPv6 handling across the operator codebase?

I am afraid not. Since we cannot determine the server pod IP before creating it, the operator cannot construct the final parameter --conf spark.driver.host=${POD_IP} in advance.

@dbbvitor

dbbvitor commented Nov 4, 2025

Would it make sense to generalize or reuse this condition for setting values such as spark.driver.host, to ensure consistent IPv6 handling across the operator codebase?

I am afraid not. Since we cannot determine the server pod IP before creating it, the operator cannot construct the final parameter --conf spark.driver.host=${POD_IP} in advance.

I think he is suggesting having the dynamic bracket-wrapping logic in Go instead of Bash.

Maybe the snippet he quoted could be its own function so it could be used standalone or inside the GetMasterURL() function, avoiding duplication.

Maybe something like:

func driverConfOption(...) {
    // ...

    // dynamicIPv6Brackets is a proposed helper, not an existing function.
    podIP, err := util.dynamicIPv6Brackets(POD_IP)
    if err != nil {
        return nil, fmt.Errorf("failed to process IPv6 address: %w", err)
    }
    args = append(args, "--conf",
        fmt.Sprintf("spark.driver.host=%s", podIP),
    )
}
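
The suggestion above could be sketched as a small standalone helper. This is only an illustration, not the code merged in this PR: the name BracketIPv6 is hypothetical, and it recognizes IPv6 literals with the standard net/netip package rather than the strings.Contains(host, ":") check quoted from GetMasterURL(). It only helps where the IP is already known to Go code at call time.

```go
package main

import (
	"fmt"
	"net/netip"
	"strings"
)

// BracketIPv6 is a hypothetical helper (not part of the operator's API).
// It returns host wrapped in square brackets when host is a bare IPv6
// literal, and returns host unchanged otherwise (IPv4, hostname, or an
// address that is already bracketed).
func BracketIPv6(host string) string {
	if strings.HasPrefix(host, "[") {
		return host // already bracketed
	}
	addr, err := netip.ParseAddr(host)
	if err != nil || !addr.Is6() {
		return host // hostname, IPv4, or not an IP literal
	}
	return "[" + host + "]"
}

func main() {
	hosts := []string{"10.0.0.7", "2001:db8::1", "[2001:db8::1]", "driver-svc.default.svc"}
	for _, h := range hosts {
		fmt.Printf("spark.driver.host=%s\n", BracketIPv6(h))
	}
}
```

Because the helper returns its input unchanged for anything that is not a bare IPv6 literal, it needs no error return and could be shared by a master URL builder and a driver host setting alike.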

@ChenYi015
Member

It is good to create a util function that brackets IP addresses as needed. But this function cannot be reused in the Spark Connect case, because the code must be executed in the server pod, not in the operator itself.
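
One way to read this constraint: since ${POD_IP} is only expanded inside the server pod, the operator can at most emit a shell expression that performs the bracketing at start-up. The sketch below is illustrative and is not taken from the merged change. The name driverHostConf is hypothetical, and it assumes the resulting argument is later evaluated by a POSIX shell in the pod with POD_IP set in the environment.

```go
package main

import "fmt"

// driverHostConf is a hypothetical helper. It returns a spark.driver.host
// setting whose value is a shell command substitution, so the IPv6 check
// runs inside the pod, where envVar (for example POD_IP) is defined.
// Assumption: the returned string is evaluated by a POSIX shell.
func driverHostConf(envVar string) string {
	// If the value contains ":", treat it as IPv6 and print it in brackets;
	// otherwise print it unchanged.
	expr := fmt.Sprintf(
		`$(case "${%[1]s}" in (*:*) printf '[%%s]' "${%[1]s}";; (*) printf '%%s' "${%[1]s}";; esac)`,
		envVar,
	)
	return `spark.driver.host="` + expr + `"`
}

func main() {
	// The operator would only ever see this text, never the resolved IP.
	fmt.Println("--conf", driverHostConf("POD_IP"))
}
```

The Go code never inspects an address here. It only generates text, which is why a Go-side helper for bracketing cannot cover this path.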

… Spark Connect

Signed-off-by: tiago.matos <tiago.matos@gympass.com>
@tiagotxm tiagotxm marked this pull request as ready for review November 7, 2025 15:01
Member

@ChenYi015 ChenYi015 left a comment

@tiagotxm Thanks for fixing the IPv6 issue.
/approve

@google-oss-prow
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: ChenYi015

@google-oss-prow google-oss-prow bot merged commit d49d47f into kubeflow:master Nov 10, 2025
18 checks passed
Development

Successfully merging this pull request may close these issues.

[Spark Connect] AssertionError: Expected hostname or IPv6 IP enclosed in []

3 participants