
[Bug]: amd build failing due to DeadlineExceeded #203

Open · 1 task done
uptownhr opened this issue Jan 15, 2025 · 1 comment

Labels: bug (Something isn't working), triage (Needs to be triaged)

Comments

@uptownhr
Collaborator

Prior Search

  • I have already searched this project's issues to determine if a bug report has already been made.

What happened?

I am using wf_dockerfile_build to build an image; however, the amd64 step consistently fails with the following error:

error: listing workers for Build: failed to list workers: DeadlineExceeded: logical service 10.0.102.169:1234: route default.endpoint: backend default.unknown: service in fail-fast

Steps to Reproduce

Unknown

Relevant log output

Full logs from the workflow step:

{"argo":true,"level":"info","msg":"waiting for dependency \"scale-buildkit\"","time":"2025-01-15T14:58:42.586Z"}
{"argo":true,"level":"info","msg":"waiting for dependency \"clone\"","time":"2025-01-15T14:58:43.586Z"}
{"argo":true,"level":"info","msg":"capturing logs","time":"2025-01-15T14:58:44.586Z"}
error: listing workers for Build: failed to list workers: DeadlineExceeded: logical service 10.0.102.169:1234: route default.endpoint: backend default.unknown: service in fail-fast
{"argo":true,"error":null,"level":"info","msg":"sub-process exited","time":"2025-01-15T14:58:49.593Z"}
{"argo":true,"level":"info","msg":"not saving outputs - not main container","time":"2025-01-15T14:58:49.593Z"}
Error: exit status 1
@uptownhr added the bug (Something isn't working) and triage (Needs to be triaged) labels on Jan 15, 2025
@fullykubed
Member

To me, this indicates an issue with Cilium launching on new amd64 nodes.

My guess is that because we use the VPA to set resource requests/limits for the Cilium node agent, and all nodes in the cluster are usually arm64, the VPA sets inappropriate resources (too low, leading to OOM) for Cilium when the first amd64 node launches in the cluster. This should eventually resolve itself as the VPA adjusts its resource recommendations upwards.

I will have to dig into this more to propose a solution. I am not yet sure how to get the VPA to make different recommendations based on the CPU architecture.
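One possible stopgap (a sketch, not a confirmed fix for this issue) is to put a floor under the VPA's recommendation for the Cilium agent via `minAllowed`, so a freshly launched amd64 node never starts Cilium below a baseline known to work on both architectures. The namespace, DaemonSet name, container name, and resource values below are illustrative assumptions, not values taken from this cluster:

```yaml
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: cilium-agent
  namespace: kube-system   # assumed namespace; adjust to wherever Cilium runs
spec:
  targetRef:
    apiVersion: apps/v1
    kind: DaemonSet
    name: cilium           # assumed DaemonSet name
  updatePolicy:
    updateMode: "Auto"
  resourcePolicy:
    containerPolicies:
      - containerName: cilium-agent   # assumed container name
        # Floor values are illustrative guesses, not measured requirements.
        minAllowed:
          cpu: 100m
          memory: 256Mi
```

Note that this does not make the VPA's recommendations architecture-aware; it only prevents recommendations (learned on arm64) from dropping below a floor that is adequate on amd64 as well.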
