Client
This question is not as straightforward as it sounds. What does 'cluster down' mean? All brokers down? Or just the leaders for the partitions we care about? Does it include the replicas? If all brokers are down, is this perhaps just a configuration error on the client, or a temporary networking problem?
If we propagate broker state information via the client, should we then make partition leader information available, and maybe consumer coordinator information? Should we provide everything you need to re-implement the capability the client already provides?
We take the approach that, as a user, you shouldn't need to care. You configure the `message.timeout.ms` and `message.max.retries` settings and let the client take care of the rest. At the end of the day, it typically boils down to one question: how much time will I allow for a message to be sent before it is deemed outdated?
Yes, this is a valid ask - you may well want to take different action on application startup than when the cluster becomes unavailable later. For example, misconfigurations are common, and you might want to detect them. At the moment this is a bit awkward to test for: you'll need to listen on the error handler for the all-brokers-down error and add logic to determine how soon after application startup it occurred.
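A minimal sketch of that workaround, assuming an error callback that receives an error-code name (the `"_ALL_BROKERS_DOWN"` string mirrors librdkafka's `ERR__ALL_BROKERS_DOWN` code; the 10-second window is an arbitrary choice):

```python
import time

APP_START = time.monotonic()
STARTUP_WINDOW_S = 10.0  # hypothetical: errors in the first 10s count as startup failures

def on_error(err_name, reason):
    # Distinguish "never connected" (likely a bad bootstrap.servers or
    # network config) from "lost the cluster after running fine".
    if err_name == "_ALL_BROKERS_DOWN":
        if time.monotonic() - APP_START < STARTUP_WINDOW_S:
            return "likely misconfiguration"
        return "cluster became unavailable"
    return None
```

Note the all-brokers-down error is informational: the client keeps retrying regardless of what your handler decides.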
What factors determine the connection count? (#brokers, #topics, #partitions, #client consumer instances, other?)
Refer to: https://github.com/edenhill/librdkafka/wiki/FAQ#number-of-broker-tcp-connections
The number of open connections is determined by the number of brokers. The client writes to and reads from the broker that is the leader for the partition of interest, so a client will commonly require connections to all brokers.
The worst-case number of connections held open by librdkafka is `cnt(bootstrap.servers) + cnt(brokers in Metadata response)`. The minimum number of connections held open is `cnt(brokers in Metadata response)`. Currently, librdkafka holds connections open to all brokers whether or not they are needed; in the future, we plan to enhance librdkafka so that unused connections are not maintained.
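The bounds above are simple counts; as a worked example (broker names are placeholders):

```python
def worst_case_connections(bootstrap_servers, metadata_brokers):
    # Worst case: each bootstrap address plus each broker learned from
    # the Metadata response may hold its own connection.
    return len(bootstrap_servers) + len(metadata_brokers)

def min_connections(metadata_brokers):
    # Best case: only the brokers reported in the Metadata response.
    return len(metadata_brokers)

# Two bootstrap addresses against a three-broker cluster:
# worst case 2 + 3 = 5 connections, minimum 3.
worst = worst_case_connections(["b1:9092", "b2:9092"], ["b1", "b2", "b3"])
best = min_connections(["b1", "b2", "b3"])
```

The worst case exceeds the broker count because a bootstrap address may not correspond one-to-one with a broker in the Metadata response.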
You should generally avoid creating multiple Consumer or Producer instances in your process if possible:
- Each client will maintain a connection with all the brokers in the cluster.
- If you create multiple clients, data won't be batched as efficiently as if you had just one.
- All operations on a client are thread safe (with some minor exceptions documented in the API), so a single instance can be shared.
You can create multiple clients if you want; if you do, they won't share any data between them.
You won't gain throughput by creating multiple clients in a single process - instead, subscribe to all your topics in one consumer and dispatch processing of messages to worker tasks (TODO: examples and benchmark of those solutions).
You should also consider deploying your application on multiple machines - this improves reliability and prevents the bandwidth between your Kafka cluster and a single worker machine from becoming the bottleneck. This is the idiomatic way to use Kafka to process data from multiple partitions.
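The single-consumer / many-workers pattern can be sketched as follows (the `process` handler and the message list are placeholders; a real application would poll the consumer in the loop instead):

```python
from concurrent.futures import ThreadPoolExecutor

def process(msg):
    # Placeholder per-message work; replace with your own handler.
    return msg.upper()

def consume_loop(messages, workers=4):
    # One consumer polls the cluster (simulated here by iterating
    # `messages`) and fans each message out to a pool of worker tasks,
    # keeping connection count and batching efficiency optimal.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(process, m) for m in messages]
        return [f.result() for f in futures]
```

If ordering matters, commit offsets only after the corresponding futures complete, or partition the work per Kafka partition so per-partition order is preserved.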