This Apache Kafka Connect sink-only connector streams messages from Apache Kafka to Bigtable, the Google Cloud Platform (GCP) wide-column store.
Apache Kafka is an open-source stream processing platform developed by the Apache Software Foundation and written in Scala and Java. The project aims to provide a unified, high-throughput, low-latency platform for real-time data feeds. For more details, please refer to the Apache Kafka home page.
Bigtable is a compressed, high-performance, proprietary data storage system built on Google File System, Chubby Lock Service, SSTable, and a few other Google technologies. On May 6, 2015, a public version of Bigtable was made available as a service in the Google Cloud Platform. For more details, please refer to the GCP Bigtable home page.
This project uses the bigtable-client-core library (not the HBase client) to stream data to GCP Bigtable. Internally, bigtable-client-core uses the gRPC framework to talk to GCP Bigtable.
You need Apache ZooKeeper and Apache Kafka installed and running on your computer. Please refer to the respective sites to download and start ZooKeeper and Kafka. You also need Java version 11 or above.
Software | Version | Note |
---|---|---|
Java | 11 | Tested using Java 11. |
Kafka | 3.3.1 | Tested using kafka_2.13-3.3.1; should work with older versions. |
bigtable-client-core | 1.27.1 | |
Kafka connect-api | 3.3.1 | |
grpc-netty-shaded | 1.51.0 | |
Please refer to the project wiki.
The current configuration system supports streaming messages from a given topic to a table. You can subscribe to any number of topics, but each topic can point to one and only one table. Every subscribed topic needs a yml file named after it: for example, if you subscribe to a topic named demo-topic, you should have a file named demo-topic.yml. That yml file contains all the configuration required to transform the topic's messages and write them into Bigtable.
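The topic-to-file naming rule can be illustrated with a small Java sketch. This is not the project's own code: the class `TopicConfigResolver`, its methods, and the idea of a single configuration directory holding the yml files are assumptions made for illustration. It only shows how a list of subscribed topics could be mapped to one configuration file each.

```java
import java.nio.file.Path;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of this project's code base.
final class TopicConfigResolver {

    private TopicConfigResolver() {}

    // Maps one topic name to its configuration file,
    // for example demo-topic -> <configDir>/demo-topic.yml.
    static Path configFileFor(Path configDir, String topic) {
        if (topic == null || topic.trim().isEmpty()) {
            throw new IllegalArgumentException("Topic name must not be empty");
        }
        return configDir.resolve(topic.trim() + ".yml");
    }

    // Resolves exactly one file per subscribed topic. A topic listed twice is
    // rejected, so every topic ends up with a single configuration
    // (and therefore a single target table).
    static Map<String, Path> resolveAll(Path configDir, List<String> topics) {
        Map<String, Path> result = new LinkedHashMap<>();
        for (String topic : topics) {
            Path file = configFileFor(configDir, topic);
            if (result.putIfAbsent(topic.trim(), file) != null) {
                throw new IllegalArgumentException("Topic listed more than once: " + topic);
            }
        }
        return result;
    }
}
```

Under these assumptions, `TopicConfigResolver.resolveAll(Paths.get("conf"), List.of("demo-topic"))` would return a map from `demo-topic` to the path of `demo-topic.yml` inside `conf`. The directory name `conf` is only an example.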
Please refer to the project wiki.
Either create an issue in this project or send an email to bt@sanju.org. Thanks!