supreme-pancake

Repo for Big Data Management project

This project consists of three components: a producer / data collector (Kafka), a distributed database (Cassandra) and a consumer / data processor (Spark).
The project simulates the collection of data from a network of sensors, which must then be processed and stored in a distributed and efficient way. The data collected (or generated) by Kafka are processed by Spark and saved to Cassandra for long-term archiving.
The machines are connected over ZeroTier, which keeps the setup simple and scalable.
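
The data flow described above can be illustrated with a minimal, standard-library-only sketch. It is not the code from this repository: the function names, the message fields (`sensor_id`, `tick`, `temperature`) and the value range are assumptions made for illustration. The first function plays the role of the simulated sensor network on the producer side, emitting JSON-encoded bytes of the kind one would typically hand to a Kafka producer. The second stands in for the processing step that Spark performs before results are archived.

```python
import json
import random
from collections import defaultdict
from statistics import mean


def simulate_readings(n_sensors, n_ticks, seed=0):
    """Yield one JSON-encoded reading per sensor per tick (producer side).

    Field names and the temperature range are hypothetical.
    """
    rng = random.Random(seed)  # seeded so that runs are reproducible
    for tick in range(n_ticks):
        for sensor_id in range(n_sensors):
            reading = {
                "sensor_id": f"sensor-{sensor_id}",
                "tick": tick,
                "temperature": round(rng.uniform(15.0, 30.0), 2),
            }
            # Bytes payload, as a message broker would carry it.
            yield json.dumps(reading).encode("utf-8")


def average_by_sensor(messages):
    """Decode messages and average the temperature per sensor (processing side)."""
    values = defaultdict(list)
    for raw in messages:
        reading = json.loads(raw.decode("utf-8"))
        values[reading["sensor_id"]].append(reading["temperature"])
    return {sensor: mean(temps) for sensor, temps in values.items()}


if __name__ == "__main__":
    messages = list(simulate_readings(n_sensors=3, n_ticks=5))
    for sensor, avg in sorted(average_by_sensor(messages).items()):
        print(f"{sensor}: {avg:.2f}")
```

In the real deployment the generator's output would be published to a Kafka topic and the aggregation would run as a Spark job writing to Cassandra; the sketch only shows the shape of the messages and of the per-sensor aggregation.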

Leave a star ⭐ if you like this project 🙂 thank you.

What's inside

Kafka module
Cassandra module
Spark module
Data cleaning scripts
Distributed job start and stop scripts
Project runme script
Project document with details
