2 Node K8S, MetalLB, pihole service externalTrafficPolicy #233
-
I just wanted to back up @quotidian-ennui's experience with my own. I came across this after fighting with Pihole for a couple of days on my new K3s installation. I'm new to Kubernetes and still very much in the early learning phase. My setup is a 3-node K3s cluster running on two RPis and an Intel NUC. I'm using MetalLB for load balancing, Traefik for ingress and Longhorn for shared storage (I think those are the right terms). I used Mojo2600's Helm chart for deploying Pihole (of course, or else why would I be here?).

All my HTTPS traffic for the UIs for Traefik, Portainer, Longhorn and Pihole routes to the respective services as expected. However, Pihole would only respond to DNS on certain machines. After much digging and reading, I think I've figured out why it was so intermittent: when a device ARPs for an IP address that MetalLB uses for load balancing, one of the nodes responds with the MAC address of its own network card, and that node is then used by the device for all further communication. With Pihole, DNS queries were only answered if the node that answered the ARP request happened to be hosting the Pihole pod. I verified this from Windows as follows:
If the MAC address for the LB IP matched the host currently running the Pihole pod, things worked fine; if not, the test failed. I then ran arp -d to clear the ARP cache; after a few tries I would get a different MAC address and would repeat the same test. Once I set externalTrafficPolicy=cluster, Pihole worked in all combinations. According to https://metallb.universe.tf/usage/, setting externalTrafficPolicy to local prevents requests from being forwarded off the node that receives them.
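For reference, a minimal sketch of a LoadBalancer Service carrying DNS with the policy set to Cluster, assuming MetalLB hands out the address; the names, label and IP here are placeholders rather than anything generated by the Helm chart:

```yaml
# Sketch only: a DNS LoadBalancer Service with externalTrafficPolicy: Cluster.
# Names, labels and the address are placeholders, not chart-generated values.
apiVersion: v1
kind: Service
metadata:
  name: pihole-dns
spec:
  type: LoadBalancer
  # Cluster: any node can accept the traffic and forward it to the pod
  #          (the client source IP gets SNATed along the way).
  # Local:   the client source IP is preserved, but only the node actually
  #          running the pod will answer.
  externalTrafficPolicy: Cluster
  loadBalancerIP: 192.168.1.250      # placeholder address from a MetalLB pool
  selector:
    app: pihole
  ports:
    - name: dns-udp
      port: 53
      targetPort: 53
      protocol: UDP
    - name: dns-tcp
      port: 53
      targetPort: 53
      protocol: TCP
```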
So, it appears we have an issue here. Using externalTrafficPolicy=cluster means that everything appears to be coming from one IP address, which prevents setting rules per client, but setting externalTrafficPolicy=local breaks DNS whenever the Pihole pod isn't on the node that answered the MetalLB ARP request. Anybody have an idea how to reconcile this? Maybe I'm missing something fundamental here about how to configure MetalLB or Pihole.
-
Bumping this topic. I have deployed a new cluster just to test Traefik. After a lot of reading, I see I have this "issue", but with some differences.
Any ideas?
-
Simple: the fix when using L2 mode with MetalLB is to run Pihole with enough replicas to have one on each node, using topology spread constraints or anti-affinity to make sure one lands on each node. This way, whichever node is the MetalLB speaker will have a Pihole pod to respond. The downside is that you won't be able to see your logs easily without a sidecar to forward them to a log aggregation platform, as your ingress may drop you onto an inactive pod.

Advanced: or use BGP mode with MetalLB, and the node with the pod will advertise the IP assigned to the service for that pod. If you have more than one pod on more than one node, each node with a pod will advertise it.
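For concreteness, a minimal sketch of the spread-constraint idea on a plain Deployment; the labels, image tag and replica count are illustrative and not taken from the Helm chart:

```yaml
# Illustrative only: spread Pi-hole replicas so one lands on each node.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: pihole
spec:
  replicas: 3                         # e.g. one per node in a 3-node cluster
  selector:
    matchLabels:
      app: pihole
  template:
    metadata:
      labels:
        app: pihole
    spec:
      topologySpreadConstraints:
        - maxSkew: 1
          topologyKey: kubernetes.io/hostname   # spread across nodes
          whenUnsatisfiable: DoNotSchedule
          labelSelector:
            matchLabels:
              app: pihole
      containers:
        - name: pihole
          image: pihole/pihole:latest           # placeholder tag
          ports:
            - containerPort: 53
              protocol: UDP
```

A required podAntiAffinity keyed on kubernetes.io/hostname achieves much the same effect if topology spread constraints aren't available on your cluster version.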
-
I'd like to ask a question as to whether I'm doing the right thing or even if the right thing matters in this instance, provided it works. I couldn't find any similar discussions using externalTrafficPolicy as the search term so here we are. I'm using the helm chart version pihole-2.9.0 / appVersion = 2022.05
My confusion here lies in the default value for externalTrafficPolicy; Local has the side-effect of not losing the source IP, while Cluster seems more correct because we're already trying to run in Kubernetes (the implication being that you're going to end up with more than one node at some point).

I have a 2-node K8S cluster with MetalLB running in L2 mode, nothing actually edgy. I had the problem where Pihole would deploy onto node2 but no DNS traffic would be routed to it. I'm using nginx as the ingress controller.
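(Purely for context, and assuming a recent MetalLB release with the CRD-based configuration rather than the older ConfigMap: an L2 setup is just an address pool plus an L2Advertisement along these lines, with the pool name and range below being placeholders.)

```yaml
# Assumed/illustrative MetalLB L2 configuration (CRD style, MetalLB >= 0.13).
apiVersion: metallb.io/v1beta1
kind: IPAddressPool
metadata:
  name: default-pool
  namespace: metallb-system
spec:
  addresses:
    - 192.168.0.99-192.168.0.110     # placeholder range
---
apiVersion: metallb.io/v1beta1
kind: L2Advertisement
metadata:
  name: default-l2
  namespace: metallb-system
spec:
  ipAddressPools:
    - default-pool
```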
In almost all situations node1 is the metallb leader; node2 is not.
This is the configuration that I ended up with, in order to make the pihole deploy well enough to work in my setup on either node.
As far as I can work out, metallb running in L2 mode means that the leader gets all the traffic for 192.168.0.99 and then has to decide where it goes, which according to my limited brain works something like this:

- If externalTrafficPolicy == Local && pihole is deployed on node1: the metallb leader gets the request and does know what to do with it, so nslookup requests from client machines work.
- If externalTrafficPolicy == Local && pihole is deployed on node2: the metallb leader gets the request and doesn't know what to do with it, so nslookup requests time out.
- If externalTrafficPolicy == Cluster and you're running with 2 separate services (i.e. mixedService == false), then you get a few things stuck in a pending state when you do kubectl get svc -A.
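Purely as an illustration of where the setting lives in the chart values (this is not the configuration referred to above, and the serviceDns / serviceWeb key names and the annotation should be checked against values.yaml for the chart version in use):

```yaml
# Assumed key names; verify against the chart's values.yaml.
mixedService: false                   # keep separate TCP/UDP DNS services
serviceDns:
  type: LoadBalancer
  externalTrafficPolicy: Cluster      # works whichever node is the L2 speaker
  loadBalancerIP: 192.168.0.99        # the address used above
  annotations:
    # MetalLB annotation that lets several Services share one IP; without
    # something like this, split services can sit pending waiting for an address.
    metallb.universe.tf/allow-shared-ip: pihole
serviceWeb:
  type: LoadBalancer
  externalTrafficPolicy: Cluster
  loadBalancerIP: 192.168.0.99
  annotations:
    metallb.universe.tf/allow-shared-ip: pihole
```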
Consequences