DillipKumarNayak2000/New_York_taxi_Projects

Uber Data Analytics | Modern Data Engineering GCP Project

Introduction

Developed and implemented an end-to-end ETL pipeline using Mage.ai to extract, transform, and load the Uber dataset into Google BigQuery for analysis. Used Google Cloud Storage to store and manage raw data files throughout the data processing workflow, and structured the data in BigQuery for fast, scalable querying and reporting. Created interactive visualizations and dashboards in Looker Studio to present insights on ride patterns, peak usage times, and customer behavior. Ensured data accuracy and performance by applying best practices in cloud-based data engineering.

Architecture

Architecture diagram (image)

Technology Used

  1. Programming Language (Python)
  2. MySQL
  3. Google Cloud Platform
    • BigQuery
    • Cloud Storage
    • Looker Studio
    • Compute Instance
  4. Mage.ai (modern data pipeline tool)

Modern Data Pipeline Tool: https://www.mern.ai/

DataSet Used

TLC Trip Record Data: yellow and green taxi trip records include fields capturing pick-up and drop-off dates/times, pick-up and drop-off locations, trip distances, itemized fares, rate types, payment types, and driver-reported passenger counts.
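
To make the field list above concrete, here is a minimal sketch of reading trip records with the Python standard library. The column names (`tpep_pickup_datetime`, `trip_distance`, and so on) are assumed to follow the yellow taxi data dictionary, the two sample rows are made up for illustration, and `parse_trip` is a hypothetical helper rather than code from this repository.

```python
import csv
import io
from datetime import datetime

# Made-up sample rows; column names are an assumption about the input file.
SAMPLE_CSV = """tpep_pickup_datetime,tpep_dropoff_datetime,passenger_count,trip_distance,fare_amount,payment_type
2016-03-01 00:00:00,2016-03-01 00:07:55,1,2.50,9.0,1
2016-03-01 00:10:00,2016-03-01 00:31:06,2,7.18,24.0,2
"""

TIME_FORMAT = "%Y-%m-%d %H:%M:%S"


def parse_trip(row):
    """Convert one raw CSV row (all strings) into typed values."""
    pickup = datetime.strptime(row["tpep_pickup_datetime"], TIME_FORMAT)
    dropoff = datetime.strptime(row["tpep_dropoff_datetime"], TIME_FORMAT)
    return {
        "pickup": pickup,
        "dropoff": dropoff,
        # Derived field: trip duration in minutes.
        "duration_minutes": (dropoff - pickup).total_seconds() / 60,
        "passenger_count": int(row["passenger_count"]),
        "trip_distance": float(row["trip_distance"]),
        "fare_amount": float(row["fare_amount"]),
        "payment_type": int(row["payment_type"]),
    }


trips = [parse_trip(r) for r in csv.DictReader(io.StringIO(SAMPLE_CSV))]
for trip in trips:
    print(trip["pickup"], trip["trip_distance"], round(trip["duration_minutes"], 2))
```

Typing the fields early makes later steps such as grouping by pickup hour or summing fares straightforward.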

More Info About Dataset

Data Model

Data model diagram (image)

Scripts For Project

  1. Extract.py
  2. Transformation.py
  3. Load.py
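
The three scripts map onto the extract, transform, and load stages of the pipeline. The sketch below shows one way the stages could fit together, using only the standard library. It is not the code in this repository: the function names, the `hour_dim` / `fact_table` split, and the in-memory `warehouse` dict (standing in for BigQuery) are all assumptions made for illustration, and the actual data model is only shown as an image.

```python
import csv
import io
from datetime import datetime

TIME_FORMAT = "%Y-%m-%d %H:%M:%S"


def extract(csv_text):
    """Extract: read raw CSV text into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(csv_text)))


def transform(rows):
    """Transform: split rows into a pickup-hour dimension and a fact table."""
    hour_to_id = {}  # pickup hour -> surrogate key
    facts = []
    for trip_id, row in enumerate(rows):
        pickup = datetime.strptime(row["tpep_pickup_datetime"], TIME_FORMAT)
        # Reuse the key if this hour was seen before, else assign the next one.
        hour_id = hour_to_id.setdefault(pickup.hour, len(hour_to_id))
        facts.append({
            "trip_id": trip_id,
            "hour_id": hour_id,
            "fare_amount": float(row["fare_amount"]),
        })
    hour_dim = [{"hour_id": i, "pickup_hour": h} for h, i in hour_to_id.items()]
    return {"hour_dim": hour_dim, "fact_table": facts}


def load(tables, warehouse):
    """Load: append each table's rows to the in-memory stand-in warehouse."""
    for name, rows in tables.items():
        warehouse.setdefault(name, []).extend(rows)
    return warehouse


def run_pipeline(csv_text):
    """Chain the three stages in order."""
    return load(transform(extract(csv_text)), {})


# Made-up input rows for demonstration only.
RAW = """tpep_pickup_datetime,fare_amount
2016-03-01 00:00:00,9.0
2016-03-01 00:10:00,24.0
2016-03-01 08:15:00,12.5
"""

warehouse = run_pipeline(RAW)
print(len(warehouse["hour_dim"]), "hours,", len(warehouse["fact_table"]), "trips")
```

Keeping each stage a plain function that takes the previous stage's output is what lets a pipeline tool such as Mage.ai run them as separate, chained blocks.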
