
Flow of orders - flowders

A simple project where a producer generates random order data and sends it to a Kafka cluster (KRaft). There are 3 brokers and one 'orders' topic with 10 partitions, one per customer (customer_id is the partition key). A Spark cluster with 2 workers and 4 executors reads the stream from the topic and computes per-customer aggregations every minute.
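Because customer_id is the partition key, each customer's orders always land on the same partition. Kafka's default partitioner hashes the key bytes with murmur2 and takes the positive result modulo the partition count. A minimal pure-Python sketch of that scheme (function names here are illustrative, not part of the project):

```python
def murmur2(data: bytes) -> int:
    """Pure-Python port of the 32-bit murmur2 hash used by Kafka's default partitioner."""
    length = len(data)
    seed = 0x9747B28C
    m = 0x5BD1E995
    r = 24
    h = (seed ^ length) & 0xFFFFFFFF

    # Process the input four bytes at a time, little-endian.
    n = length & ~3
    for i in range(0, n, 4):
        k = int.from_bytes(data[i:i + 4], "little")
        k = (k * m) & 0xFFFFFFFF
        k ^= k >> r
        k = (k * m) & 0xFFFFFFFF
        h = (h * m) & 0xFFFFFFFF
        h ^= k

    # Mix in the trailing 1-3 bytes (mirrors the Java switch fall-through).
    left = length & 3
    if left == 3:
        h ^= data[n + 2] << 16
    if left >= 2:
        h ^= data[n + 1] << 8
    if left >= 1:
        h ^= data[n]
        h = (h * m) & 0xFFFFFFFF

    # Final avalanche.
    h ^= h >> 13
    h = (h * m) & 0xFFFFFFFF
    h ^= h >> 15
    return h


def partition_for(key: bytes, num_partitions: int) -> int:
    """Default partitioner: positive 32-bit murmur2 hash modulo partition count."""
    return (murmur2(key) & 0x7FFFFFFF) % num_partitions
```

With 10 partitions and 10 customers this maps each customer ID deterministically to a partition, e.g. `partition_for(b"CUS_98LKLSKX", 10)`; note that distinct customers are not guaranteed to land on distinct partitions.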


Input example:

{"order_id":4444,"customer_id":"CUS_98LKLSKX","amount":608.21,"timestamp":"2025-09-22T19:27:05.425770"}
{"order_id":3560,"customer_id":"CUS_98LKLSKX","amount":204.67,"timestamp":"2025-09-22T19:27:11.433860"}
{"order_id":7517,"customer_id":"CUS_O2LV2E0Q","amount":322.45,"timestamp":"2025-09-22T19:27:20.438745"}
{"order_id":6987,"customer_id":"CUS_RCTQC56L","amount":866.6,"timestamp":"2025-09-22T19:27:26.443795"}
{"order_id":3520,"customer_id":"CUS_FCPCPQOK","amount":215.33,"timestamp":"2025-09-22T19:27:36.450808"}
{"order_id":1097,"customer_id":"CUS_FCPCPQOK","amount":668.06,"timestamp":"2025-09-22T19:27:41.456560"}
{"order_id":2549,"customer_id":"CUS_5W0GO8GP","amount":616.46,"timestamp":"2025-09-22T19:27:48.461685"}
{"order_id":2662,"customer_id":"CUS_HB6C22TP","amount":757.36,"timestamp":"2025-09-22T19:27:51.465676"}
{"order_id":8961,"customer_id":"CUS_1XE8U8UW","amount":327.33,"timestamp":"2025-09-22T19:27:57.471678"}
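Records of this shape can be built with a small generator. The sketch below only constructs the JSON payloads; actually sending them to the cluster with a Kafka client (e.g. kafka-python) is assumed to happen elsewhere, and the customer IDs here are freshly generated, not the project's actual ones:

```python
import json
import random
import string
from datetime import datetime

# Fixed pool of 10 customers, one per topic partition.
CUSTOMERS = [
    "CUS_" + "".join(random.choices(string.ascii_uppercase + string.digits, k=8))
    for _ in range(10)
]

def make_order() -> str:
    """Build one random order record as a JSON string."""
    order = {
        "order_id": random.randint(1000, 9999),
        "customer_id": random.choice(CUSTOMERS),
        "amount": round(random.uniform(1.0, 1000.0), 2),
        "timestamp": datetime.now().isoformat(),
    }
    return json.dumps(order)
```

A producer loop would then call `make_order()` on an interval and publish each payload to the 'orders' topic with customer_id as the message key.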

Output example:

+------------------------------------------+--------------+-------------+-------------+
|window                                    |customer_id   |client_orders|client_amount|
+------------------------------------------+--------------+-------------+-------------+
|{2025-09-22 19:27:00, 2025-09-22 19:28:00}|CUS_O2LV2E0Q  |1            |322.45       |
|{2025-09-22 19:27:00, 2025-09-22 19:28:00}|CUS_IFNDUZGA  |1            |757.36       |
|{2025-09-22 19:27:00, 2025-09-22 19:28:00}|CUS_FCPCPQOK  |2            |883.39       |
|{2025-09-22 19:27:00, 2025-09-22 19:28:00}|CUS_JMUUMX57  |1            |866.60       |
|{2025-09-22 19:27:00, 2025-09-22 19:28:00}|CUS_RCTQC56L  |1            |327.33       |
|{2025-09-22 19:27:00, 2025-09-22 19:28:00}|CUS_98LKLSKX  |2            |812.88       |
|{2025-09-22 19:27:00, 2025-09-22 19:28:00}|CUS_5W0GO8GP  |1            |616.46       |
+------------------------------------------+--------------+-------------+-------------+
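The Spark job produces this table by grouping orders into one-minute tumbling windows per customer and computing the order count and total amount. The same logic, illustrated in plain Python on a few of the sample records above (the actual job uses Spark's window() over the event timestamp):

```python
from collections import defaultdict
from datetime import datetime

orders = [
    {"customer_id": "CUS_98LKLSKX", "amount": 608.21, "timestamp": "2025-09-22T19:27:05.425770"},
    {"customer_id": "CUS_98LKLSKX", "amount": 204.67, "timestamp": "2025-09-22T19:27:11.433860"},
    {"customer_id": "CUS_FCPCPQOK", "amount": 215.33, "timestamp": "2025-09-22T19:27:36.450808"},
    {"customer_id": "CUS_FCPCPQOK", "amount": 668.06, "timestamp": "2025-09-22T19:27:41.456560"},
]

# (window_start, customer_id) -> [order count, running amount]
agg = defaultdict(lambda: [0, 0.0])
for o in orders:
    ts = datetime.fromisoformat(o["timestamp"])
    # Truncate to the minute: start of the 1-minute tumbling window.
    window_start = ts.replace(second=0, microsecond=0)
    key = (window_start, o["customer_id"])
    agg[key][0] += 1
    agg[key][1] += o["amount"]

for (window_start, customer), (count, amount) in sorted(agg.items()):
    print(window_start, customer, count, round(amount, 2))
```

This reproduces the client_orders and client_amount columns, e.g. CUS_FCPCPQOK gets 2 orders totalling 883.39 in the 19:27:00-19:28:00 window.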

UI

Kafka UI: localhost:8888

Spark App UI: localhost:4040

Spark Master UI: localhost:8080

Spark Worker1 UI: localhost:8081

Spark Worker2 UI: localhost:8082

Run project

docker compose up --build

kafka CLI

You can use any of the brokers as the bootstrap server to access the cluster:

docker compose exec kafka1 bash

/opt/kafka/bin/kafka-topics.sh --list --bootstrap-server kafka1:9092

/opt/kafka/bin/kafka-topics.sh --describe --topic orders --bootstrap-server kafka1:9092

/opt/kafka/bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic orders --partition 5 --from-beginning

Coming soon

Sink to Apache Iceberg

About

Flow of orders - Kafka cluster + Spark cluster in a streaming pipeline. Deep dive on Kafka murmur2
