
A real-time data pipeline using Kafka, Spark, and Cassandra for processing and storing credit card expenses. Includes a Spring Boot application that retrieves personnel data from PostgreSQL, stores images in S3, and displays employee details with expense reports on a web interface.


FatihArslan-cmd/Kafka-Spark-Cassandra-Expense-Tracker


Kafka-Spark-Cassandra-Expense-Tracker SpringBoot Web App






A project that integrates Spring Boot, PostgreSQL, AWS S3, Kafka, Spark, and Cassandra to manage employee data and expenses.


📖 Table of Contents

  1. 📘 About The Project
  2. 🚀 Getting Started
  3. 📦 Dependencies
  4. Screenshots
  5. 🤝 Contributing
  6. 📞 Contact

📘 About The Project

Key Features:

  • 🗄️ PostgreSQL Database Integration: Employee and department data are stored in PostgreSQL, with data imported from CSV files for easy initialization.
  • 🖼️ AWS S3 Image Storage: Employee images are stored in AWS S3 for secure and scalable image storage.
  • 📋 Web Interface: Displays employee details (name, manager name, salary, commission, department) with a JOIN operation, allowing for easy management and viewing.
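  • Continuous Expense Data: Credit card expense records are read continuously from Cassandra, where the Kafka and Spark pipeline stores them.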
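
The JOIN behind the employee view can be pictured with a small in-memory sketch. The record and field names used here (Employee, Department, EmployeeRow, managerId, and so on) are hypothetical, since this README does not show the actual schema or query. In the application the join would presumably run in PostgreSQL rather than in Java.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical shapes: the real table and column names are not given in this README.
record Employee(int id, String name, Integer managerId, double salary, double commission, int departmentId) {}
record Department(int id, String name) {}
record EmployeeRow(String name, String managerName, double salary, double commission, String department) {}

final class EmployeeJoinSketch {
    // In-memory equivalent of joining employees to their manager (a self-join)
    // and to their department; a missing match becomes null, as in a LEFT JOIN.
    static List<EmployeeRow> join(List<Employee> employees, List<Department> departments) {
        Map<Integer, String> employeeNames = new HashMap<>();
        for (Employee e : employees) {
            employeeNames.put(e.id(), e.name());
        }
        Map<Integer, String> departmentNames = new HashMap<>();
        for (Department d : departments) {
            departmentNames.put(d.id(), d.name());
        }

        List<EmployeeRow> rows = new ArrayList<>();
        for (Employee e : employees) {
            String manager = e.managerId() == null ? null : employeeNames.get(e.managerId());
            rows.add(new EmployeeRow(e.name(), manager, e.salary(), e.commission(),
                    departmentNames.get(e.departmentId())));
        }
        return rows;
    }

    public static void main(String[] args) {
        List<Employee> employees = List.of(
                new Employee(1, "Ada", null, 9000.0, 0.0, 10),
                new Employee(2, "Linus", 1, 6000.0, 300.0, 10));
        List<Department> departments = List.of(new Department(10, "Engineering"));
        join(employees, departments).forEach(System.out::println);
    }
}
```

Each output row carries the five fields the web interface lists: name, manager name, salary, commission, and department.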


🚀 Getting Started

To get a local copy up and running, follow these steps.

📋 Prerequisites

Ensure you have the following software installed:

  • Java 17+
  • Maven
  • apache-cassandra-3.11.10
  • kafka_2.12-3.9.0
  • spark-2.4.5-bin-hadoop2.7
  • AWS CLI (for AWS S3 integration)
  • PostgreSQL

⚙️ Installation

  1. Clone the repository:
    git clone https://github.com/FatihArslan-cmd/Kafka-Spark-Cassandra-Expense-Tracker.git
  2. Navigate to the project directory:
    cd demo
  3. Install dependencies:
    mvn clean install
  4. Run the project:
    mvn spring-boot:run

Set up AWS S3 Bucket:

  1. Create an S3 bucket and upload the sample images (linked from the repository as "images").
  2. Configure your AWS credentials using aws configure.

Set up PostgreSQL:

  1. Import the employee and department data into PostgreSQL from the CSV files provided in data.

Set up Kafka, Spark, and Cassandra:

  1. Follow the setup instructions in the [DataGenerator-Kafka- repository](https://github.com/FatihArslan-cmd/DataGenerator-Kafka-).

🔑 Configuration

Add the following keys to your application.properties file:


# AWS credentials and S3 bucket (write values without quotes; quotes become part of the value)
aws.accessKeyId=
aws.secretAccessKey=
aws.region=
aws.bucketName=

# Cassandra connection
spring.cassandra.contact-points=
spring.cassandra.port=
spring.cassandra.keyspace-name=
spring.cassandra.local-datacenter=
spring.cassandra.schema-action=none

# PostgreSQL datasource
spring.datasource.url=
spring.datasource.username=
spring.datasource.password=
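
Before starting the application, it can help to confirm that none of the keys above were left unset. The following is a minimal sketch using only java.util.Properties; the class name ConfigCheck and the method missingKeys are hypothetical and not part of the project. It also flags values left as a literal pair of quotes, because java.util.Properties keeps quotation marks as part of the value.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

// Hypothetical helper, not part of the project.
final class ConfigCheck {
    // Keys taken from the configuration listing above (schema-action already has a value).
    static final List<String> REQUIRED_KEYS = List.of(
            "aws.accessKeyId", "aws.secretAccessKey", "aws.region", "aws.bucketName",
            "spring.cassandra.contact-points", "spring.cassandra.port",
            "spring.cassandra.keyspace-name", "spring.cassandra.local-datacenter",
            "spring.datasource.url", "spring.datasource.username", "spring.datasource.password");

    // Returns the required keys that are absent, blank, or still set to a literal "".
    static List<String> missingKeys(String propertiesText) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(propertiesText));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        List<String> missing = new ArrayList<>();
        for (String key : REQUIRED_KEYS) {
            String value = props.getProperty(key, "").trim();
            if (value.isEmpty() || value.equals("\"\"")) {
                missing.add(key);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // Sample text for illustration; in practice, read application.properties instead.
        String sample = "aws.region=\"\"\naws.bucketName=my-bucket\n";
        System.out.println("Unset keys: " + missingKeys(sample));
    }
}
```

In the sample, aws.region is reported as unset because its value is the two quote characters, while aws.bucketName passes.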

🛠️ Usage

Once the project is running:


Important

The Spring Boot application requires Java 17, whereas Apache Kafka, Apache Spark, and Cassandra are configured to run on Java 8. On Ubuntu, run Kafka, Spark, and Cassandra under a dedicated user profile set up with Java 8, and run the Spring Boot application under a separate user profile configured with Java 17 to ensure compatibility.
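
Because two Java versions are involved, a quick check of which one a given user profile actually runs can prevent confusing startup errors. The sketch below is an illustration only; the class JavaVersionCheck and its methods are hypothetical and not part of the project. It reads the standard java.specification.version system property, which is "1.8" on Java 8 and "17" on Java 17, and it uses only syntax that compiles on both.

```java
// Hypothetical helper, not part of the project.
final class JavaVersionCheck {
    // Maps "1.8" to 8 and "17" to 17: versions before Java 9 use the "1.x" form.
    static int majorVersion(String specVersion) {
        String[] parts = specVersion.trim().split("\\.");
        int first = Integer.parseInt(parts[0]);
        return (first == 1 && parts.length > 1) ? Integer.parseInt(parts[1]) : first;
    }

    // Fails fast when the given version string does not match the expected major version.
    static void requireMajor(int expected, String specVersion) {
        int actual = majorVersion(specVersion);
        if (actual != expected) {
            throw new IllegalStateException(
                    "Expected Java " + expected + " but found Java " + actual);
        }
    }

    public static void main(String[] args) {
        String running = System.getProperty("java.specification.version");
        System.out.println("Running on Java " + majorVersion(running));
    }
}
```

Calling requireMajor(8, running) in the Kafka, Spark, and Cassandra profile, or requireMajor(17, running) in the Spring Boot profile, would surface a mismatch immediately.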


📦 Dependencies

  • Java 17
  • Spring Boot
  • PostgreSQL
  • AWS SDK for Java (for S3 integration)
  • Maven (for build management)

🤝 Contributing

Contributions are what make the open source community such an amazing place to learn, inspire, and create. Any contributions you make are greatly appreciated.

  1. Fork the Project
  2. Create your Feature Branch (git checkout -b feature/AmazingFeature)
  3. Commit your Changes (git commit -m 'Add some AmazingFeature')
  4. Push to the Branch (git push origin feature/AmazingFeature)
  5. Open a Pull Request

📞 Contact

Fatih Arslan - Software Engineer
