This module demonstrates the following:

- The use of the Kafka Streams DSL, including `aggregate()`, `windowedBy().advanceBy()`, `groupByKey()`, `selectKey()`, `toStream()`, and `peek()`.
- Unit testing using the Topology Test Driver.
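As an illustration of the Topology Test Driver approach, the sketch below tests a deliberately minimal stand-in topology (an uppercase mapper, not this module's real one): records are piped through the topology in-process, without a running broker.

```java
import java.util.Properties;
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.common.serialization.StringDeserializer;
import org.apache.kafka.common.serialization.StringSerializer;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.TestInputTopic;
import org.apache.kafka.streams.TestOutputTopic;
import org.apache.kafka.streams.TopologyTestDriver;
import org.apache.kafka.streams.kstream.Consumed;
import org.apache.kafka.streams.kstream.Produced;

public class TopologyTestDriverExample {

    // Pipes one value through a minimal topology and returns the result.
    public static String pipe(String value) {
        // Minimal topology: uppercase each value.
        StreamsBuilder builder = new StreamsBuilder();
        builder.stream("input", Consumed.with(Serdes.String(), Serdes.String()))
               .mapValues(v -> v.toUpperCase())
               .to("output", Produced.with(Serdes.String(), Serdes.String()));

        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "test-app");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "dummy:1234");

        // The driver executes the topology synchronously, no broker needed.
        try (TopologyTestDriver driver = new TopologyTestDriver(builder.build(), props)) {
            TestInputTopic<String, String> in = driver.createInputTopic(
                "input", new StringSerializer(), new StringSerializer());
            TestOutputTopic<String, String> out = driver.createOutputTopic(
                "output", new StringDeserializer(), new StringDeserializer());

            in.pipeInput("key", value);
            return out.readValue();
        }
    }

    public static void main(String[] args) {
        System.out.println(pipe("doe"));
    }
}
```

The same pattern applies to this module's real topology: build it with `StreamsBuilder`, wrap it in a `TopologyTestDriver`, pipe test records in, and assert on the output topic.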
In this module, records of type `<String, KafkaUser>` are streamed from a topic named `USER_TOPIC`.
The following tasks are performed:

- Group the stream by last name using the `groupByKey()` operation.
- Apply an aggregator that combines each `KafkaUser` record with the same last name into a `KafkaUserGroup` object, aggregating the first names by last name. The aggregations are performed over a 5-minute time window with a 2-minute hop and a 1-minute grace period for delayed records.
- Write the resulting records to a new topic named `USER_AGGREGATE_HOPPING_WINDOW_TOPIC`.
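The steps above can be sketched with the Streams DSL as follows. To keep the sketch self-contained, plain `String` serdes and values of the form `"First Last"` stand in for the module's Avro `KafkaUser`/`KafkaUserGroup` types; the topic names and windowing parameters follow the text.

```java
import java.time.Duration;
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.Topology;
import org.apache.kafka.streams.kstream.Consumed;
import org.apache.kafka.streams.kstream.Grouped;
import org.apache.kafka.streams.kstream.Materialized;
import org.apache.kafka.streams.kstream.Produced;
import org.apache.kafka.streams.kstream.TimeWindows;

public class UserAggregateHoppingWindowTopology {

    public static Topology build() {
        StreamsBuilder builder = new StreamsBuilder();

        builder.stream("USER_TOPIC", Consumed.with(Serdes.String(), Serdes.String()))
            // Re-key each record by last name ("First Last" -> "Last").
            .selectKey((key, user) -> user.split(" ")[1])
            .groupByKey(Grouped.with(Serdes.String(), Serdes.String()))
            // 5-minute windows advancing every 2 minutes, with a 1-minute
            // grace period for delayed records.
            .windowedBy(TimeWindows
                .ofSizeAndGrace(Duration.ofMinutes(5), Duration.ofMinutes(1))
                .advanceBy(Duration.ofMinutes(2)))
            // Aggregate first names per last name; the real module builds a
            // KafkaUserGroup, here a comma-separated string stands in for it.
            .aggregate(
                () -> "",
                (lastName, user, agg) -> agg.isEmpty()
                    ? user.split(" ")[0]
                    : agg + "," + user.split(" ")[0],
                Materialized.with(Serdes.String(), Serdes.String()))
            // Drop the window from the key before writing the result out.
            .toStream((windowedKey, agg) -> windowedKey.key())
            .peek((lastName, agg) -> System.out.println(lastName + " -> " + agg))
            .to("USER_AGGREGATE_HOPPING_WINDOW_TOPIC",
                Produced.with(Serdes.String(), Serdes.String()));

        return builder.build();
    }
}
```

Because hopping windows overlap, a record can contribute to several windows, and each input record produces an updated aggregate for every window it falls into.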
The output records will be in the following format:

```
{"firstNameByLastName":{"Last name 1":{"First name 1", "First name 2", "First name 3"}}}
{"firstNameByLastName":{"Last name 2":{"First name 4", "First name 5", "First name 6"}}}
{"firstNameByLastName":{"Last name 3":{"First name 7", "First name 8", "First name 9"}}}
```
To compile and run this demo, you will need the following:
- Java 21
- Maven
- Docker
To run the application manually:

- Start a Confluent Platform in a Docker environment.
- Produce records of type `<String, KafkaUser>` to a topic named `USER_TOPIC`. You can use the producer User to do this.
- Start the Kafka Streams application.
To run the application in Docker, use the following command:

```shell
docker-compose up -d
```
This command will start the following services in Docker:
- 1 Kafka broker (KRaft mode)
- 1 Schema Registry
- 1 Control Center
- 1 producer User
- 1 Kafka Streams Aggregate Hopping Window