The Spark Memory Configuration Calculator helps data engineers and Spark developers quickly determine optimal memory and core configurations for their Spark clusters. With this tool, you can avoid common tuning pitfalls, such as oversized executors or cores and memory left idle on each node, and ensure your cluster resources are used efficiently, leading to better performance and lower costs.
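As a rough illustration of the kind of arithmetic such a calculator performs, here is a minimal sketch based on the widely cited sizing heuristic (about 5 cores per executor, one core and some memory reserved per node for the OS and daemons, and roughly 10% of executor memory set aside as overhead). The function name, parameters, and reserved amounts are illustrative assumptions, not the calculator's actual API:

```python
def spark_config(nodes, cores_per_node, mem_per_node_gb,
                 cores_per_executor=5, os_reserved_cores=1, os_reserved_gb=1):
    """Estimate executor settings via the common 5-cores-per-executor heuristic.

    All reserved amounts are assumptions for illustration; tune for your cluster.
    """
    usable_cores = cores_per_node - os_reserved_cores        # leave a core for OS/daemons
    executors_per_node = usable_cores // cores_per_executor  # executors that fit on one node
    total_executors = executors_per_node * nodes - 1         # reserve one slot for the driver/AM
    mem_per_executor = (mem_per_node_gb - os_reserved_gb) / executors_per_node
    overhead = max(0.384, 0.10 * mem_per_executor)           # memoryOverhead: ~10%, min 384 MiB
    heap = mem_per_executor - overhead                       # what remains for the executor heap
    return {
        "spark.executor.instances": total_executors,
        "spark.executor.cores": cores_per_executor,
        "spark.executor.memory": f"{heap:.1f}g",
        "spark.executor.memoryOverhead": f"{overhead:.1f}g",
    }

# Example: a 10-node cluster with 16 cores and 64 GB of RAM per node
print(spark_config(nodes=10, cores_per_node=16, mem_per_node_gb=64))
```

For the example cluster above, this yields 29 executors with 5 cores, roughly 18.9 GB of heap, and about 2.1 GB of overhead each, which is the shape of answer the calculator aims to produce automatically.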
This project demonstrates the creation of a scalable data processing pipeline for handling and analyzing log data from a hypothetical e-commerce platform. Leveraging Hadoop and PySpark, the pipeline is designed to process large volumes of log files, providing meaningful insights into user behavior, system performance, and sales metrics.
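To make the pipeline's shape concrete, below is a minimal PySpark sketch of the aggregation stage. The HDFS paths, the JSON log schema (`user_id`, `event`, `status`, `amount`, `ts`), and the specific metrics are hypothetical stand-ins for the project's actual log format:

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("ecommerce-log-pipeline").getOrCreate()

# Hypothetical layout: JSON-lines log files landed on HDFS by an ingest job
logs = spark.read.json("hdfs:///logs/ecommerce/*.json")

# Sales metric: daily revenue from purchase events
daily_revenue = (
    logs.filter(F.col("event") == "purchase")
        .groupBy(F.to_date("ts").alias("day"))
        .agg(F.sum("amount").alias("revenue"))
)

# System-performance metric: fraction of requests per day with 5xx statuses
error_rate = (
    logs.groupBy(F.to_date("ts").alias("day"))
        .agg(F.avg((F.col("status") >= 500).cast("int")).alias("error_rate"))
)

# Persist results in a columnar format for downstream analysis
daily_revenue.write.mode("overwrite").parquet("hdfs:///analytics/daily_revenue")
error_rate.write.mode("overwrite").parquet("hdfs:///analytics/error_rate")
```

Writing intermediate results as Parquet keeps the downstream analytics queries cheap, since Spark can prune columns and partitions instead of rescanning raw logs.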
This repository highlights my ability to develop and integrate diverse Python solutions, ranging from API creation and data management to cloud service integration. Each project serves a specific purpose, demonstrating both fundamental concepts and practical applications essential to real-world software development.