Prior Search
I have already searched this project's issues to determine if a bug report has already been made.
What happened?
I am using wf_dockerfile_build to build an image, but the amd64 step consistently fails with the following error:
error: listing workers for Build: failed to list workers: DeadlineExceeded: logical service 10.0.102.169:1234: route default.endpoint: backend default.unknown: service in fail-fast
Steps to Reproduce
Unknown
Relevant log output
full logs from workflow step
{"argo":true,"level":"info","msg":"waiting for dependency \"scale-buildkit\"","time":"2025-01-15T14:58:42.586Z"}
{"argo":true,"level":"info","msg":"waiting for dependency \"clone\"","time":"2025-01-15T14:58:43.586Z"}
{"argo":true,"level":"info","msg":"capturing logs","time":"2025-01-15T14:58:44.586Z"}
error: listing workers for Build: failed to list workers: DeadlineExceeded: logical service 10.0.102.169:1234: route default.endpoint: backend default.unknown: service in fail-fast
{"argo":true,"error":null,"level":"info","msg":"sub-process exited","time":"2025-01-15T14:58:49.593Z"}
{"argo":true,"level":"info","msg":"not saving outputs - not main container","time":"2025-01-15T14:58:49.593Z"}
Error: exit status 1
To me, this indicates that there is an issue with Cilium launching on new amd64 nodes.
My guess is that because we use the VPA to set resource requests and limits for the Cilium node agent, and all nodes in the cluster are usually arm64, the VPA sets inappropriate resources (too low, causing OOM) for Cilium when the first amd64 node launches in the cluster. This will eventually resolve itself as the VPA adjusts its resource recommendations upwards.
I will have to dig into this more to propose a solution. I am not yet sure how to get the VPA to make different recommendations based on the CPU architecture.
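One possible stopgap, untested here, is to give the VPA a floor so that its recommendation for Cilium never drops below what an amd64 node needs. The VPA API supports this through `resourcePolicy.containerPolicies[].minAllowed`. The sketch below is only illustrative: the object name, namespace, DaemonSet name, container name, and memory value are all assumptions and would need to match the actual deployment. It does not make the VPA architecture-aware; it only raises the minimum for every node.

```yaml
# Hypothetical sketch: all names and values below are assumptions, not taken from this project.
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: cilium                # assumed VPA name
  namespace: cilium           # assumed namespace
spec:
  targetRef:
    apiVersion: apps/v1
    kind: DaemonSet
    name: cilium              # assumed DaemonSet name
  resourcePolicy:
    containerPolicies:
      - containerName: cilium-agent   # assumed container name
        minAllowed:
          memory: 512Mi       # placeholder floor; size it from observed amd64 usage
```

The trade-off is that arm64 nodes would also reserve at least this much memory for the agent, even if they need less.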