This repository contains common Gearpump usage patterns implemented in Java.
The examples include:
- word count -- a simple Java app that shows the structure of a Gearpump app
- Kafka -> Kafka pipeline -- a very simple example that shows how to read from and write to Kafka topics
- Kafka -> HBase pipeline -- how to read from a Kafka topic and write to HBase
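To illustrate what the word-count example computes, here is a minimal, self-contained sketch of the split-and-count logic in plain Java. This is not the Gearpump task API; the class and method names are illustrative only.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch of the per-message counting logic a word-count
// task performs. Plain Java, independent of the Gearpump task API.
public class WordCountLogic {
    private final Map<String, Long> counts = new HashMap<>();

    // Split an incoming line into words and update the running counts.
    public void onMessage(String line) {
        for (String word : line.toLowerCase().split("\\s+")) {
            if (!word.isEmpty()) {
                counts.merge(word, 1L, Long::sum);
            }
        }
    }

    public long countOf(String word) {
        return counts.getOrDefault(word, 0L);
    }

    public static void main(String[] args) {
        WordCountLogic wc = new WordCountLogic();
        wc.onMessage("the quick brown fox");
        wc.onMessage("the lazy dog");
        System.out.println(wc.countOf("the")); // prints 2
    }
}
```

In the actual example, this kind of logic lives inside Gearpump tasks wired together into a DAG; the sketch only shows the computation itself.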
The following sections explain:
- How to build and run the examples
- How each example works
The repository is organized as a single Maven project that contains all the examples.
To build the examples, run:
mvn package
After the build, the jar can be found at target/streaming-java-template-$VERSION.jar.
- Start the gearpump cluster (0.4)
a) Download from http://www.gearpump.io/site/downloads/
b) After extraction, start the local cluster
bin/local
c) Start the UI server
bin/services
- Submit the jar
bin/gear app -jar path/to/streaming-java-template-$VERSION.jar <app mainclass with package>
for example:
bin/gear app -jar target/streaming-java-template-$VERSION.jar javatemplate.WordCount
- Check the UI at http://127.0.0.1:8090/
NOTE:
Please use Java 8 to run the cluster, because Gearpump 0.8.0 only supports Java 8.
You can set the JAVA_HOME environment variable:
On Windows: set JAVA_HOME={path_to_java_8}
On Linux: export JAVA_HOME={path_to_java_8}
A very simple example that shows how to read from and write to Kafka topics.
The example uses the Gearpump Connector API (KafkaSource and KafkaSink), which makes simple operations with Kafka easy.
When defining the Kafka source, you need to provide the topic name and the ZooKeeper location:
KafkaSource kafkaSource = new KafkaSource("inputTopic", "localhost:2181");
When defining the Kafka sink (output), you just provide the destination topic name and the Kafka broker address:
KafkaSink kafkaSink = new KafkaSink("outputTopic", "localhost:9092");
Keep in mind that the Kafka source processor produces messages as byte arrays (byte[]), while the Kafka sink processor expects messages to be scala.Tuple.
The example includes dedicated steps that perform the necessary conversions. (The conversions don't need to be separate steps; you could include them in the tasks that do the actual computation.)
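The conversion steps can be sketched in plain, self-contained Java as below. This is illustrative only: the real conversion tasks extend the Gearpump task API, and on the sink side the encoded bytes are wrapped into a scala.Tuple before being handed to KafkaSink.

```java
import java.nio.charset.StandardCharsets;

// Illustrative conversion steps (plain Java, not the Gearpump task API).
public class Conversions {
    // Source side: the Kafka source emits raw byte arrays; decode to String
    // before doing any text processing.
    public static String decode(byte[] raw) {
        return new String(raw, StandardCharsets.UTF_8);
    }

    // Sink side: encode the processed String back to bytes. In the example,
    // this byte array is then wrapped in a scala.Tuple for the Kafka sink.
    public static byte[] encode(String message) {
        return message.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] raw = encode("hello gearpump");
        System.out.println(decode(raw)); // prints "hello gearpump"
    }
}
```
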
This example uses ZooKeeper and Kafka, so you need to set them up before running.
Start ZooKeeper and Kafka:
zookeeper/bin/zkServer.sh start
kafka/bin/kafka-server-start.sh kafka/config/server.properties
(Tested with ZooKeeper 3.4.6 and Kafka 2.11-0.8.2.1 with default settings.)
The app reads messages from inputTopic and writes to outputTopic, so you may need to create both topics beforehand:
kafka/bin/kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 1 --partitions 1 --topic inputTopic
kafka/bin/kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 1 --partitions 1 --topic outputTopic
After you have prepared the Kafka topics and deployed the app to the Gearpump cluster, you can start using it.
Start producing messages to the input topic:
kafka/bin/kafka-console-producer.sh --broker-list localhost:9092 --topic inputTopic
Check whether anything appears on the output topic:
kafka/bin/kafka-console-consumer.sh --zookeeper localhost:2181 --topic outputTopic --from-beginning
The tasks write to the application logs, so you can browse them to follow the execution flow.
The logs should be under a location similar to:
$GEARPUMP_HOME/logs/applicationData/<<user>>/<<date>>/<<appN>>/