kafka.Consumer.Pause([]TopicPartition) hangs: could rd_kafka_q_wait_result(tmpq, RD_POLL_INFINITE) in rd_kafka_toppars_pause_resume be the cause? #4705
alexseel-a3949
started this conversation in
General
I'm reluctant to raise a defect, since nobody else seems to be reporting this issue. We hit it in production: when we paused the Kafka consumer, the call hung and we had to kill a live service.
Looking at the librdkafka C code in rdkafka_partition.c, the function rd_kafka_toppars_pause_resume (line 2382) looks suspicious: when run synchronously (which it is when called from the higher-level APIs), it seems to wait for a response for each partition with an infinite timeout.
This matches what we see.
I will be doing more reproduction and tracing with debug=all enabled to dump the full Kafka logs, but in the meantime I would like a recommendation for how to pause the assigned topic partitions robustly, even when a problem with the topic, brokers or replicas causes this function to hang forever at line 2431:
    if (!async) {
        while (waitcnt-- > 0)
            rd_kafka_q_wait_result(tmpq, RD_POLL_INFINITE);
It would be great if anyone from the community or the Confluent team could advise on best practice here.
Thanks,
Alex