This Big Data modelling project performs real-time analysis of Uber trip data to bring out insights such as:
- The regions where the taxi services are used most frequently
- The locations with the highest number of pickups and drop-offs
- The passenger count per ride
- The fare amounts of different trips, their distances, and the payment types customers choose most often
- The total amount generated from a trip, including fare_amount, extra, mta_tax, tip_amount, tolls_amount and improvement_surcharge
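As a minimal sketch of the last point, the total for a trip can be computed by summing the listed charge columns with Pandas. The sample rows and the `add_total_amount` helper are hypothetical and only illustrate the idea; the real dataset may already carry a precomputed total column.

```python
import pandas as pd

# Charge components named in the list above.
CHARGE_COLUMNS = [
    "fare_amount",
    "extra",
    "mta_tax",
    "tip_amount",
    "tolls_amount",
    "improvement_surcharge",
]

# Hypothetical sample rows, made up for illustration only.
trips = pd.DataFrame({
    "fare_amount": [10.0, 7.5],
    "extra": [0.5, 0.0],
    "mta_tax": [0.5, 0.5],
    "tip_amount": [2.0, 0.0],
    "tolls_amount": [0.0, 5.54],
    "improvement_surcharge": [0.3, 0.3],
})


def add_total_amount(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of df with a total_amount column summed row by row."""
    out = df.copy()
    out["total_amount"] = out[CHARGE_COLUMNS].sum(axis=1).round(2)
    return out


trips_with_total = add_total_amount(trips)
print(trips_with_total[["fare_amount", "total_amount"]])
```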
ADITYA SHARMA
~ Modelled the raw data as Fact and Dimension tables in a diagram built with Lucidchart
~ Wrote the transformation logic with Pandas in Python, in Jupyter notebooks, to reshape the data into the planned format
~ Extracted the Dimension tables from the raw data using Python and linked them to the Fact table through primary and foreign keys
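The dimension extraction described above could look roughly like the following Pandas sketch. The sample rows, the `build_dimension` helper, the `*_id` key names and the payment code mapping are all assumptions made for illustration, not the project's actual notebook code.

```python
import pandas as pd

# Hypothetical raw trip rows; the real dataset has many more columns.
raw = pd.DataFrame({
    "payment_type": [1, 2, 1, 1],
    "passenger_count": [1, 2, 1, 3],
    "fare_amount": [10.0, 7.5, 22.0, 5.0],
})

# Assumed code-to-name mapping, used only to enrich the dimension table.
PAYMENT_TYPE_NAMES = {1: "Credit card", 2: "Cash"}


def build_dimension(df: pd.DataFrame, column: str, key_name: str) -> pd.DataFrame:
    """Deduplicate one column into a dimension table with a surrogate primary key."""
    dim = df[[column]].drop_duplicates().reset_index(drop=True)
    dim[key_name] = dim.index
    return dim[[key_name, column]]


payment_type_dim = build_dimension(raw, "payment_type", "payment_type_id")
payment_type_dim["payment_type_name"] = payment_type_dim["payment_type"].map(PAYMENT_TYPE_NAMES)
passenger_count_dim = build_dimension(raw, "passenger_count", "passenger_count_id")

# Fact table: keep the measures and swap descriptive columns for foreign keys.
fact_table = (
    raw.reset_index()
    .rename(columns={"index": "trip_id"})
    .merge(payment_type_dim, on="payment_type", how="left")
    .merge(passenger_count_dim, on="passenger_count", how="left")
    [["trip_id", "payment_type_id", "passenger_count_id", "fare_amount"]]
)

print(payment_type_dim)
print(fact_table)
```

Each dimension holds one row per distinct value, so the fact table only needs to store the small integer key instead of repeating the descriptive value on every trip.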
ADITYA MISHRA
~ Implemented the Data Loader, Transformer and Data Exporter blocks of the ETL pipeline using Mage.ai to convert the 100k-row raw data into the planned structure
~ Migrated the transformed data to Google Cloud Platform
~ Loaded the transformed data into BigQuery on Google Cloud Platform
ANKIT GHOSH
~ Analysed and ran SQL queries on 15,000 KB+ of data in BigQuery
~ Grouped and joined the Dimension tables with the Fact table
~ Presented the refined data on a dashboard built with Google’s Looker Studio for analysis
Dashboard link: https://lookerstudio.google.com/s/rramPly6jJw
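The join-and-group step that feeds the dashboard was run as queries on BigQuery. As a rough Pandas equivalent, the sketch below joins a fact table to one dimension table and aggregates per payment type. The table contents and column names are hypothetical and are not taken from the project's queries.

```python
import pandas as pd

# Hypothetical dimension and fact tables, for illustration only.
payment_type_dim = pd.DataFrame({
    "payment_type_id": [0, 1],
    "payment_type_name": ["Credit card", "Cash"],
})
fact_table = pd.DataFrame({
    "trip_id": [0, 1, 2, 3],
    "payment_type_id": [0, 1, 0, 0],
    "fare_amount": [10.0, 7.5, 22.0, 5.0],
})

# Join the fact table to the dimension on its foreign key.
joined = fact_table.merge(payment_type_dim, on="payment_type_id", how="inner")

# Group by the descriptive attribute and aggregate the measures.
summary = (
    joined.groupby("payment_type_name", as_index=False)
    .agg(trip_count=("trip_id", "count"), avg_fare=("fare_amount", "mean"))
    .sort_values("trip_count", ascending=False)
    .reset_index(drop=True)
)

print(summary)
```

The same shape of result (one row per payment type with a trip count and an average fare) is what a SQL `JOIN` followed by `GROUP BY` would return.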
Programming Language: Python
Google Cloud Platform: Cloud Storage, BigQuery, Looker Studio
Modern Data Pipeline Tool: https://www.mage.ai/