Running maxwell on kubernetes #2036

Open

KURANADO2 opened this issue Aug 22, 2023 · 3 comments
KURANADO2 commented Aug 22, 2023

I want to run Maxwell on Kubernetes.

Below is my raft.xml file:

<?xml version='1.0' encoding='utf-8'?>
<config xmlns="urn:org:jgroups"
        xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
        xsi:schemaLocation="urn:org:jgroups http://www.jgroups.org/schema/jgroups.xsd">
    <UDP mcast_addr="228.8.8.8" mcast_port="${jgroups.udp.mcast_port:45588}"/>
    <PING />
    <MERGE3 />
    <FD_SOCK/>
    <FD_ALL/>
    <VERIFY_SUSPECT timeout="1500"/>
    <pbcast.NAKACK2 xmit_interval="500"/>
    <UNICAST3 xmit_interval="500"/>
    <pbcast.STABLE desired_avg_gossip="50000" max_bytes="4M"/>
    <raft.NO_DUPES/>
    <pbcast.GMS print_local_addr="true" join_timeout="2000"/>
    <UFC max_credits="2M" min_threshold="0.4"/>
    <MFC max_credits="2M" min_threshold="0.4"/>
    <FRAG2 frag_size="60K"/>
    <raft.ELECTION election_min_interval="500" election_max_interval="1000" heartbeat_interval="250"/>
    <raft.RAFT members="A,B,C" raft_id="${raft_id:undefined}"/>
    <raft.REDIRECT/>
</config>

Below is my Kubernetes deploy YAML file (maxwell.yaml):

---
apiVersion: v1
kind: Service
metadata:
  name: maxwell
  namespace: abup
#  annotations:
#    nginx.ingress.kubernetes.io/affinity: "true"
#    nginx.ingress.kubernetes.io/session-cookie-name: backend
#    nginx.ingress.kubernetes.io/load-balancer-method: drr
spec:
  type: NodePort
  sessionAffinity: ClientIP
  selector:
    app: maxwell
  ports:
  - name: web
    port: 7800
    targetPort: 7800
    nodePort: 30111
---
apiVersion: v1
kind: Service
metadata:
  name: maxwell-headless
  namespace: abup
  labels:
    app: maxwell

spec:
  ports:
    - port: 7800
      name: server
      targetPort: 7800
  clusterIP: None
  selector:
    app: maxwell

---
apiVersion: apps/v1
kind: StatefulSet
metadata:
#  labels:
#    k8s-app: maxwell
#    qcloud-app: maxwell
  name: maxwell
  namespace: abup
spec:
  serviceName: maxwell-headless
  replicas: 3
  selector:
    matchLabels:
      k8s-app: maxwell
      qcloud-app: maxwell
#  strategy:
#    rollingUpdate:
#      maxSurge: 1
#      maxUnavailable: 0
#    type: RollingUpdate
  template:
    metadata:
      labels:
        k8s-app: maxwell
        qcloud-app: maxwell
      annotations:
        pod.alpha.kubernetes.io/initialized: "true"
    spec:
      affinity:
       nodeAffinity:
        requiredDuringSchedulingIgnoredDuringExecution: 
          nodeSelectorTerms:
          - matchExpressions:
            - key: kubernetes.io/hostname
              operator: In
              values:
              - 10.19.10.13
              - 10.19.10.2
              - 10.19.10.7

      containers:
      - env:
        - name: JAVA_OPTS
          value: -Djava.net.preferIPv4Stack=true
        - name: jgroup.tcpping.initial_hosts
          value: 10.19.10.13[7800],10.19.10.2[7800],10.19.10.7[7800]
        image: abup-registry-test.tencentcloudcr.com/abup-qa-gq-ota/maxwell:v1.37.521
        imagePullPolicy: Always
        name: maxwell
        command: ["/bin/bash","-c"]
        args: ["/app/maxwell.sh"]
        resources:
          limits:
            cpu: 1
            memory: 2Gi
          requests:
            cpu: 1
            memory: 2Gi
        volumeMounts:
        - mountPath: /etc/maxwell/config.properties
          subPath: config.properties
          name: maxwell-conf
      #  volumeMounts:
      #  - mountPath: /app/raft.xml
      #    name: raft-xml
      dnsPolicy: ClusterFirst
      imagePullSecrets:
      - name: docker-login
      restartPolicy: Always
      volumes:
      - name: maxwell-conf
        configMap:
          name: maxwell-conf
          defaultMode: 0777
    #  volumes:
    #  - name: raft-xml
    #    hostPath:
    #      path: /root/raft.xml

Content of the maxwell.sh script referenced in the YAML file:

#!/bin/bash
hostname=$(cat /etc/hostname)
# pick the raft member id based on which StatefulSet pod this is
if [ "$hostname" = maxwell-0 ]; then
  ./bin/maxwell --producer_partition_by=table --config /etc/maxwell/config.properties --ha --raft_member_id=A
elif [ "$hostname" = maxwell-1 ]; then
  ./bin/maxwell --producer_partition_by=table --config /etc/maxwell/config.properties --ha --raft_member_id=B
elif [ "$hostname" = maxwell-2 ]; then
  ./bin/maxwell --producer_partition_by=table --config /etc/maxwell/config.properties --ha --raft_member_id=C
else
  echo "No Matched" > test.txt
fi
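
A more compact variant, as a sketch only (not the poster's script), derives the member letter from the StatefulSet ordinal baked into the pod hostname instead of hard-coding an if/elif chain. It assumes the same image and config paths as the manifest above:

containers:
- name: maxwell
  image: abup-registry-test.tencentcloudcr.com/abup-qa-gq-ota/maxwell:v1.37.521
  command: ["/bin/bash", "-c"]
  args:
    - |
      # maxwell-0 -> ordinal 0 -> member A; maxwell-1 -> B; maxwell-2 -> C
      ordinal="${HOSTNAME##*-}"
      letters=(A B C)
      exec ./bin/maxwell --producer_partition_by=table \
        --config /etc/maxwell/config.properties \
        --ha --raft_member_id="${letters[$ordinal]}"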

When I run kubectl apply -f maxwell.yaml, the logs for the three pods are as follows:

(screenshot of the three pods' startup logs; not captured in this transcript)

What should I do to get the three nodes to start an election?

osheroff (Collaborator) commented

Honestly I think that running maxwell on kubernetes PLUS raft feels like overkill -- k8s already has a great mechanism for running one and only one copy of a service like maxwell at a time, plus has support for restarting it when it's down, or when a node dies, etc.

If you decide you really, really need to do that, I think you have to configure jgroups-raft to talk via TCP instead of its normal multicast thing, but it's out of my expertise; I'd check over there.
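
For reference, a minimal sketch of how a custom raft.xml could be shipped into the pod via a ConfigMap, assuming Maxwell picks up raft.xml from its working directory (the commented-out /app/raft.xml mount in the manifest above suggests this); the ConfigMap name is illustrative:

apiVersion: v1
kind: ConfigMap
metadata:
  name: maxwell-raft-xml   # hypothetical name
  namespace: abup
data:
  raft.xml: |
    <!-- a TCP-based jgroups-raft stack goes here; see the working config below -->
---
# In the StatefulSet pod template, alongside the existing config mount:
#   volumeMounts:
#   - mountPath: /app/raft.xml
#     subPath: raft.xml
#     name: raft-xml
#   volumes:
#   - name: raft-xml
#     configMap:
#       name: maxwell-raft-xml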


yalattas commented Mar 21, 2024

I am having the same issue:

2024-03-21 13:00:53 INFO  Maxwell - Starting Maxwell. maxMemory: 8068464640 bufferMemoryUsage: 0.25
2024-03-21 13:00:53 DEBUG Configurator - set property TCP.bind_addr to default value /10.244.0.88
2024-03-21 13:00:53 DEBUG Configurator - set property TCP.diagnostics_addr to default value /224.0.75.75
2024-03-21 13:00:53 DEBUG TCP - thread pool min/max/keep-alive: 0/100/30000, internal pool: 0/4/30000 (1 cores available)
2024-03-21 13:00:53 DEBUG NAKACK2 - JGRP000037: use_mcast_xmit should not be used because the transport (TCP) does not support IP multicasting; setting use_mcast_xmit to false
2024-03-21 13:00:53 INFO  JChannel - local_addr: maxwell-0, name: maxwell-0-61765
2024-03-21 13:00:53 DEBUG LevelDBLog - Initializing log with empty Metadata

-------------------------------------------------------------------
GMS: address=maxwell-0, cluster=maxwell-0, physical address=10.244.0.88:7500
-------------------------------------------------------------------
2024-03-21 13:00:55 INFO  GMS - maxwell-0: no members discovered after 2002 ms: creating cluster as coordinator
2024-03-21 13:00:55 DEBUG NAKACK2 - 
[maxwell-0 setDigest()]
existing digest:  ]
new digest:       maxwell-0: [0 (0)]
resulting digest: maxwell-0: [0 (0)]
2024-03-21 13:00:55 DEBUG GMS - maxwell-0: installing view [maxwell-0|0] (1) [maxwell-0] (maxwell-0 joined)
2024-03-21 13:00:55 DEBUG STABLE - resuming message garbage collection
2024-03-21 13:00:55 DEBUG GMS - maxwell-0: created cluster (first member). My view is [maxwell-0|0], impl is CoordGmsImpl
2024-03-21 13:00:55 INFO  MaxwellHA - enter HA group, current leader: null
2024-03-21 13:00:57 INFO  MaxwellHA - lost HA election, current leader: null
2024-03-21 13:01:35 WARN  TCP - JGRP000012: discarded message from different cluster maxwell-1 (our cluster is maxwell-0). Sender was maxwell-1
2024-03-21 13:01:36 WARN  TCP - JGRP000012: discarded message from different cluster maxwell-2 (our cluster is maxwell-0). Sender was maxwell-2
2024-03-21 13:02:38 WARN  TCP - JGRP000012: discarded message from different cluster maxwell-1 (our cluster is maxwell-0). Sender was maxwell-1 (received 7 identical messages from maxwell-1 in the last 63314 ms)

Each pod is creating its own cluster.

It's working in k8s with the following config:

<?xml version='1.0' encoding='utf-8'?>
      <config xmlns="urn:org:jgroups"
              xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
              xsi:schemaLocation="urn:org:jgroups http://www.jgroups.org/schema/jgroups.xsd">
          <TCP  bind_port="7500" />
          <TCPPING async_discovery="true"
            initial_hosts="${jgroups.tcpping.initial_hosts:maxwell-headless[7500]}"
            port_range="3"/>
          <PING />
          <MERGE3 />
          <FD_SOCK/>
          <FD_ALL/>
          <VERIFY_SUSPECT timeout="1500"/>
          <pbcast.NAKACK2 xmit_interval="500"/>
          <UNICAST3 xmit_interval="500"/>
          <pbcast.STABLE desired_avg_gossip="50000" max_bytes="4M"/>
          <raft.NO_DUPES/>
          <pbcast.GMS print_local_addr="true" join_timeout="2000"/>
          <UFC max_credits="2M" min_threshold="0.4"/>
          <MFC max_credits="2M" min_threshold="0.4"/>
          <FRAG2 frag_size="60K"/>
          <raft.ELECTION election_min_interval="500" election_max_interval="1000" heartbeat_interval="250"/>
          <raft.RAFT members="maxwell-0,maxwell-1,maxwell-2" raft_id="${raft_id:undefined}"/>
          <raft.REDIRECT/>
      </config>
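
Worth noting: the ${jgroups.tcpping.initial_hosts:...} placeholder in that XML is substituted by JGroups from JVM system properties, so overriding it needs a -D flag rather than a bare environment variable (which may be why the jgroup.tcpping.initial_hosts env var in the first manifest had no effect). A sketch, assuming the Maxwell launch script passes JAVA_OPTS through to the JVM; the host list value is illustrative:

env:
- name: JAVA_OPTS
  value: >-
    -Djava.net.preferIPv4Stack=true
    -Djgroups.tcpping.initial_hosts=maxwell-headless[7500]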

Also, I faced an issue with TCPPING:

https://issues.redhat.com/browse/AS7-4828

https://bugzilla.redhat.com/show_bug.cgi?id=900707

2024-03-21 13:07:16 INFO  Maxwell - Starting Maxwell. maxMemory: 8068464640 bufferMemoryUsage: 0.25
java.lang.Exception: Property assignment of initial_hosts in TCPPING with original property value maxwell-0.maxwell[7500],maxwell-1.maxwell[7500],maxwell-2.maxwell[7500] and converted to null could not be assigned
    at org.jgroups.stack.Configurator.resolveAndAssignField(Configurator.java:818)
    at org.jgroups.stack.Configurator.initializeAttrs(Configurator.java:212)
    at org.jgroups.stack.Configurator.createProtocolsAndInitializeAttrs(Configurator.java:126)
    at org.jgroups.stack.Configurator.setupProtocolStack(Configurator.java:65)
    at org.jgroups.stack.Configurator.setupProtocolStack(Configurator.java:49)
    at org.jgroups.stack.ProtocolStack.setup(ProtocolStack.java:490)
    at org.jgroups.JChannel.init(JChannel.java:922)
    at org.jgroups.JChannel.<init>(JChannel.java:123)
    at org.jgroups.JChannel.<init>(JChannel.java:105)
    at com.zendesk.maxwell.MaxwellHA.startHA(MaxwellHA.java:57)
    at com.zendesk.maxwell.Maxwell.main(Maxwell.java:335)
Caused by: java.lang.Exception: Conversion of initial_hosts in TCPPING with property value maxwell-0.maxwell[7500],maxwell-1.maxwell[7500],maxwell-2.maxwell[7500] failed
    at org.jgroups.conf.PropertyHelper.getConvertedValue(PropertyHelper.java:85)
    at org.jgroups.stack.Configurator.resolveAndAssignField(Configurator.java:812)
    ... 10 more
Caused by: java.net.UnknownHostException: maxwell-1.maxwell: Name or service not known
    at java.base/java.net.Inet6AddressImpl.lookupAllHostAddr(Native Method)
    at java.base/java.net.InetAddress$PlatformNameService.lookupAllHostAddr(InetAddress.java:929)
    at java.base/java.net.InetAddress.getAddressesFromNameService(InetAddress.java:1529)
    at java.base/java.net.InetAddress$NameServiceAddresses.get(InetAddress.java:848)
    at java.base/java.net.InetAddress.getAllByName0(InetAddress.java:1519)
    at java.base/java.net.InetAddress.getAllByName(InetAddress.java:1378)
    at java.base/java.net.InetAddress.getAllByName(InetAddress.java:1306)
    at org.jgroups.util.Util.parseCommaDelimitedHosts(Util.java:3583)
    at org.jgroups.conf.PropertyConverters$InitialHosts.convert(PropertyConverters.java:49)
    at org.jgroups.conf.PropertyHelper.getConvertedValue(PropertyHelper.java:82)
    ... 11 more
2024-03-21 13:07:16 ERROR Maxwell - Maxwell saw an exception and is exiting...
java.lang.Exception: Property assignment of initial_hosts in TCPPING with original property value maxwell-0.maxwell[7500],maxwell-1.maxwell[7500],maxwell-2.maxwell[7500] and converted to null could not be assigned
    at org.jgroups.stack.Configurator.resolveAndAssignField(Configurator.java:818) ~[jgroups-5.1.2.Final.jar:5.1.2.Final]
    at org.jgroups.stack.Configurator.initializeAttrs(Configurator.java:212) ~[jgroups-5.1.2.Final.jar:5.1.2.Final]
    at org.jgroups.stack.Configurator.createProtocolsAndInitializeAttrs(Configurator.java:126) ~[jgroups-5.1.2.Final.jar:5.1.2.Final]
    at org.jgroups.stack.Configurator.setupProtocolStack(Configurator.java:65) ~[jgroups-5.1.2.Final.jar:5.1.2.Final]
    at org.jgroups.stack.Configurator.setupProtocolStack(Configurator.java:49) ~[jgroups-5.1.2.Final.jar:5.1.2.Final]
    at org.jgroups.stack.ProtocolStack.setup(ProtocolStack.java:490) ~[jgroups-5.1.2.Final.jar:5.1.2.Final]
    at org.jgroups.JChannel.init(JChannel.java:922) ~[jgroups-5.1.2.Final.jar:5.1.2.Final]
    at org.jgroups.JChannel.<init>(JChannel.java:123) ~[jgroups-5.1.2.Final.jar:5.1.2.Final]
    at org.jgroups.JChannel.<init>(JChannel.java:105) ~[jgroups-5.1.2.Final.jar:5.1.2.Final]
    at com.zendesk.maxwell.MaxwellHA.startHA(MaxwellHA.java:57) ~[maxwell-1.41.0.jar:1.41.0]
    at com.zendesk.maxwell.Maxwell.main(Maxwell.java:335) [maxwell-1.41.0.jar:1.41.0]
Caused by: java.lang.Exception: Conversion of initial_hosts in TCPPING with property value maxwell-0.maxwell[7500],maxwell-1.maxwell[7500],maxwell-2.maxwell[7500] failed
    at org.jgroups.conf.PropertyHelper.getConvertedValue(PropertyHelper.java:85) ~[jgroups-5.1.2.Final.jar:5.1.2.Final]
    at org.jgroups.stack.Configurator.resolveAndAssignField(Configurator.java:812) ~[jgroups-5.1.2.Final.jar:5.1.2.Final]
    ... 10 more
Caused by: java.net.UnknownHostException: maxwell-1.maxwell: Name or service not known
    at java.net.Inet6AddressImpl.lookupAllHostAddr(Native Method) ~[?:?]
    at java.net.InetAddress$PlatformNameService.lookupAllHostAddr(InetAddress.java:929) ~[?:?]
    at java.net.InetAddress.getAddressesFromNameService(InetAddress.java:1529) ~[?:?]
    at java.net.InetAddress$NameServiceAddresses.get(InetAddress.java:848) ~[?:?]
    at java.net.InetAddress.getAllByName0(InetAddress.java:1519) ~[?:?]
    at java.net.InetAddress.getAllByName(InetAddress.java:1378) ~[?:?]
    at java.net.InetAddress.getAllByName(InetAddress.java:1306) ~[?:?]
    at org.jgroups.util.Util.parseCommaDelimitedHosts(Util.java:3583) ~[jgroups-5.1.2.Final.jar:5.1.2.Final]
    at org.jgroups.conf.PropertyConverters$InitialHosts.convert(PropertyConverters.java:49) ~[jgroups-5.1.2.Final.jar:5.1.2.Final]
    at org.jgroups.conf.PropertyHelper.getConvertedValue(PropertyHelper.java:82) ~[jgroups-5.1.2.Final.jar:5.1.2.Final]
    at org.jgroups.stack.Configurator.resolveAndAssignField(Configurator.java:812) ~[jgroups-5.1.2.Final.jar:5.1.2.Final]
    ... 10 more
2024-03-21 13:07:16 INFO  TaskManager - Stopping 0 tasks
2024-03-21 13:07:16 INFO  TaskManager - Stopped all tasks
2024-03-21 13:07:16 DEBUG MaxwellContext - Shutdown complete: true
Stream closed EOF for maxwell/maxwell-0 (maxwell)

This happens when I have RAFT configured with the port enabled:

<?xml version='1.0' encoding='utf-8'?>
    <config xmlns="urn:org:groups"
...
        <TCP  bind_port="7500" />
        <TCPPING async_discovery="true"
          initial_hosts="${jgroups.tcpping.initial_hosts:maxwell-0.maxwell[7500],maxwell-1.maxwell[7500],maxwell-2.maxwell[7500]}"
          port_range="3"/>
...
    </config>
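
The UnknownHostException above is consistent with the per-pod DNS records simply not existing yet: StatefulSet pods only get <pod>.<service> DNS names through a governing headless service whose name matches the subdomain used in initial_hosts, and by default those records are published only once a pod passes readiness, which is a chicken-and-egg problem during peer discovery. A hedged sketch of a headless service that addresses both points:

apiVersion: v1
kind: Service
metadata:
  name: maxwell        # must match maxwell-0.maxwell[7500] etc. in initial_hosts
spec:
  clusterIP: None                  # headless: per-pod DNS records
  publishNotReadyAddresses: true   # publish records before pods are ready
  selector:
    app: maxwell
  ports:
  - name: jgroups
    port: 7500

(The StatefulSet's serviceName would also need to be maxwell for the pod hostnames to resolve under that subdomain.)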

k8s config

spec:
  containers:
  - name: maxwell
    image: zendesk/maxwell:v1.41.0
    imagePullPolicy: IfNotPresent
    command:
    - bin/maxwell
    args:
    - "--env_config_prefix=MW_"
    - "--ha"
    - "--raft_member_id=$(POD_NAME)"
    - "--client_id=$(POD_NAME)"
    env:
    - name: POD_NAME
      valueFrom:
        fieldRef:
          fieldPath: metadata.name


osheroff (Collaborator) commented

guys,

you really don't need k8s + raft. There's really no need; let k8s run "1 and exactly 1" copy of maxwell; if one dies, k8s will replace it.
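
For completeness, a minimal sketch of that pattern: a single-replica Deployment with a Recreate strategy, so the old pod is stopped before its replacement starts (image borrowed from the comment above; config path illustrative):

apiVersion: apps/v1
kind: Deployment
metadata:
  name: maxwell
spec:
  replicas: 1            # exactly one copy of maxwell
  strategy:
    type: Recreate       # stop the old pod before starting a new one
  selector:
    matchLabels:
      app: maxwell
  template:
    metadata:
      labels:
        app: maxwell
    spec:
      containers:
      - name: maxwell
        image: zendesk/maxwell:v1.41.0
        command: ["bin/maxwell"]
        args: ["--config", "/etc/maxwell/config.properties"]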
