Yellow Taxi Trips Data Analytics | Data Engineering Azure Project

Introduction

The "Yellow Taxi Trips Data Analytics" project extracts insights from New York City's yellow taxi trip records. I use Python, SQL, Azure services, and Power BI to process, analyze, and visualize the data.

Architecture

Technologies Used

Python
SQL
Azure Data Factory
Azure Databricks
Azure Synapse Analytics
Power BI

Dataset Used

Source: https://www.nyc.gov/site/tlc/about/tlc-trip-record-data.page
Data Dictionary: https://www.nyc.gov/assets/tlc/downloads/pdf/data_dictionary_trip_records_yellow.pdf

The data is published as one file per month for each year, so I wrote a simple Python script to download all the Parquet files and combine them by year. The combined dataset is stored in .parquet.gzip format to keep storage costs low. Because the full files are too large to store on GitHub without Git LFS, this side project keeps only a subset of rows, saved in CSV/Parquet format: 20,000 randomly selected rows from each month.
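
The download, sample, and combine steps described above could look roughly like the sketch below. It is an illustration, not the code in the repository's notebooks: the `BASE_URL` value and the file naming pattern are assumptions (check the links on the TLC source page), the helper names are hypothetical, and the sampling step assumes `pandas` with a Parquet engine such as `pyarrow` is installed.

```python
from pathlib import Path
from urllib.request import urlretrieve

# Assumption: monthly files are served under this base URL with the
# naming pattern below; verify against the links on the TLC page.
BASE_URL = "https://d37ci6vzurychx.cloudfront.net/trip-data"


def monthly_file_names(year):
    """Return the 12 assumed monthly Parquet file names for one year."""
    return [f"yellow_tripdata_{year}-{month:02d}.parquet" for month in range(1, 13)]


def download_year(year, dest_dir):
    """Download every monthly file for `year` into `dest_dir` (hypothetical helper)."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    paths = []
    for name in monthly_file_names(year):
        target = dest / name
        if not target.exists():  # skip files that were already downloaded
            urlretrieve(f"{BASE_URL}/{name}", str(target))
        paths.append(target)
    return paths


def sample_and_combine(paths, rows_per_month=20_000, seed=42):
    """Randomly sample up to `rows_per_month` rows per file and concatenate them."""
    import pandas as pd  # imported here so the helpers above work without pandas

    frames = []
    for path in paths:
        df = pd.read_parquet(path)
        n = min(rows_per_month, len(df))  # a month may have fewer rows than requested
        frames.append(df.sample(n=n, random_state=seed))
    return pd.concat(frames, ignore_index=True)
```

With these helpers, one year could be processed by calling `sample_and_combine(download_year(2023, "data/raw"))` and writing the result with `DataFrame.to_parquet(..., compression="gzip")`. The year and output paths here are placeholders.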

