Nomad 1.7.6 does not pick up Docker DNS settings #20174
Comments
Thank you for the detailed report @Jess3Jane and thank you @apollo13 for the git log spelunking. I was able to verify that reverting that commit does fix the problem.
I suspect we need to guard the DNS config override to explicit `cni/` networks, so the `bridge` network is not affected, but I'm not sure if this would also revert the intended fix:
https://github.com/hashicorp/nomad/blob/23e4b7c9d23350f9d3bd2707b0d79f413767c438/client/allocrunner/taskrunner/task_runner.go#L1137-L1145
I've placed the issue for further roadmapping.
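For concreteness, the kind of guard being floated here is a check on the network mode string. The sketch below is illustrative only; the names and structure are assumptions, not the actual task_runner.go code:

```go
package main

import (
	"fmt"
	"strings"
)

// shouldApplyNetworkDNS sketches the guard suggested above: only treat the
// network-level DNS as authoritative for explicitly CNI-configured networks
// ("cni/<name>"), leaving the default "bridge" mode alone. The next comment
// explains why this alone would not be enough once transparent proxy starts
// providing DNS on bridge networks as well.
func shouldApplyNetworkDNS(networkMode string) bool {
	return strings.HasPrefix(networkMode, "cni/")
}

func main() {
	fmt.Println(shouldApplyNetworkDNS("cni/mynet")) // true
	fmt.Println(shouldApplyNetworkDNS("bridge"))    // false
}
```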
Yeah, I do not think we can solely do this for `cni/` networks. This is needed for the default `bridge` network with transparent proxy as well, because then the consul-k8s plugin will provide a DNS server.
Hi folks! This is certainly a regression caused by the bug fix I did in #20007. I had a comment in that PR:
It turns out there are four! At first glance, prioritizing the four different sources would be challenging because of the work I'm doing for transparent proxy (as @apollo13 noted). But a quick check of … I'll do some testing of this and likely get a PR up later today if that proves to be the correct hypothesis. Thanks again @Jess3Jane, @apollo13, and @lgfa29!
Edit: bah, that's two different structs, as we convert from a non-pointer CNI DNS result to a Nomad-internal struct pointer, so that's not the problem exactly, but I'm sure it's something along those lines. Still investigating.
Ok, I've got it. The fix should be easy; I'm just working up some unit tests to make sure the behavior is properly exercised as well.
In #20007 we fixed a bug where the DNS configuration set by CNI plugins was not threaded through to the task configuration. This resulted in a regression where a DNS override set by `dockerd` was not respected for `bridge` mode networking. Our existing handling of CNI DNS incorrectly assumed that the DNS field would be empty, when in fact it contains a single empty DNS struct. Handle this case correctly by checking whether the DNS struct we get back from CNI has any nameservers, and ignore it if it doesn't. Expand test coverage of this case. Fixes: #20174
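The description above boils down to a presence check on the nameservers returned by CNI. A minimal, self-contained Go sketch of that idea (type and function names are illustrative assumptions, not the actual code in task_runner.go) could look like:

```go
package main

import "fmt"

// DNSConfig mirrors the shape of a DNS configuration block: nameservers,
// search domains, and resolver options. Field names are illustrative.
type DNSConfig struct {
	Servers  []string
	Searches []string
	Options  []string
}

// pickDNS sketches the guard described in the fix: explicit task-level DNS
// wins; a CNI result is only used when it actually carries nameservers;
// otherwise return nil so the task driver (for example dockerd) falls back
// to its own DNS settings.
func pickDNS(taskDNS, cniDNS *DNSConfig) *DNSConfig {
	if taskDNS != nil {
		return taskDNS
	}
	if cniDNS == nil || len(cniDNS.Servers) == 0 {
		return nil
	}
	return cniDNS
}

func main() {
	// The regression case: CNI hands back a single empty DNS struct, which
	// should be ignored rather than override dockerd's configuration.
	fmt.Println(pickDNS(nil, &DNSConfig{})) // <nil>

	// CNI (or transparent proxy) provided real nameservers: use them.
	cni := &DNSConfig{Servers: []string{"192.0.2.10"}}
	fmt.Println(pickDNS(nil, cni).Servers) // [192.0.2.10]
}
```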
Fixed in #20189.
This fix will get shipped in the next release of Nomad, as well as backported to 1.6.x and 1.5.x.
…rd into release/1.6.x (#20192). Co-authored-by: Tim Gross <tgross@hashicorp.com>
…rd into release/1.5.x (#20191). Co-authored-by: Tim Gross <tgross@hashicorp.com>
Nomad version
Version in which this functionality is broken:
root@client-1:/# nomad version
Nomad v1.7.6
BuildDate 2024-03-12T07:27:36Z
Revision 594fedbfbc4f0e532b65e8a69b28ff9403eb822e
Version in which this functionality is working (I tested down to 1.7.2 and they all behave the same):
root@client-1:/# nomad version
Nomad v1.7.5
BuildDate 2024-02-13T15:10:13Z
Revision 5f5d4646198d09b8f4f6cb90fb5d50b53fa328b8
Operating system and Environment details
This is a completely fresh DigitalOcean node (I will note all of the changes I made below). Of note, we have also hit this issue on systems of varying other configurations (all Ubuntu 22.04 x64).
Issue
Docker allows you to configure a set of DNS servers to give to each container. In Nomad 1.7.5, if you did not set the DNS settings for a job using the `docker` driver, the containers in that job would have this DNS configuration. In Nomad 1.7.6, the containers instead have the host's default DNS configuration, which is often undesirable.
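For illustration only (the reporter's actual Docker configuration is not included in this report, and the addresses below are placeholders), one common way to hand `dockerd` a default set of DNS servers for every container is the `dns` key in `/etc/docker/daemon.json`:

```sh
# Illustrative only -- not the reporter's configuration. The "dns" key sets
# the default DNS servers dockerd gives to containers that do not specify
# their own.
cat <<'EOF' | sudo tee /etc/docker/daemon.json
{
  "dns": ["192.0.2.2", "192.0.2.3"]
}
EOF
sudo systemctl restart docker

# A container started directly by Docker should now resolve through these
# servers.
docker run --rm busybox cat /etc/resolv.conf
```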
Reproduction steps
1. Download https://github.com/containernetworking/plugins/releases/download/v1.0.0/cni-plugins-linux-amd64-v1.0.0.tgz and extract it into /opt/cni/bin.
2. Configure Docker's DNS settings with a systemd drop-in at /etc/systemd/system/docker.service.d/overrides.conf.
3. Reload systemd and start the services, then run the job:
   systemctl daemon-reload
   systemctl start docker
   systemctl start nomad
   nomad job run <job-file.hcl>
Expected Result
If we exec into our task on Nomad 1.7.5 we can observe the correct value of `resolv.conf`.
Compare this to the result of running a container with Docker directly to see that they match.
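One way to make that comparison (illustrative commands; the alloc ID and task name are placeholders, and the reporter's actual output is not reproduced here):

```sh
# DNS configuration inside the Nomad-managed container.
nomad alloc exec -task <task-name> <alloc-id> cat /etc/resolv.conf

# DNS configuration Docker itself hands to a container.
docker run --rm busybox cat /etc/resolv.conf
```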
Actual Result
If you do exactly the same with Nomad 1.7.6, you will instead find that the job has a different `resolv.conf`. The exact values will likely differ on your system, but we can confirm that this is the contents of `/run/systemd/resolve/resolv.conf`.
Job file (if appropriate)
This happens with all jobs that don't configure any `dns` settings, but the specific job I've used for testing is:
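The job file itself is not reproduced above; a hypothetical job of the shape described (docker driver, `bridge` networking, no `dns` block) might look like the following:

```hcl
# Hypothetical example, not the reporter's actual job file: docker driver,
# bridge networking, and no dns block, so container DNS should come from
# dockerd's own settings.
job "dns-repro" {
  datacenters = ["dc1"]

  group "app" {
    network {
      mode = "bridge"
    }

    task "app" {
      driver = "docker"

      config {
        image   = "busybox:1.36"
        command = "sleep"
        args    = ["3600"]
      }
    }
  }
}
```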