This project sets up an end-to-end data pipeline that ingests and transforms historical data from an Azure SQL Database into a structured format in Azure Synapse Analytics, using Azure Data Lake Storage (ADLS) and Delta Tables for efficient storage and querying.
- Python
- Azure SQL Database
- T-SQL (Transact-SQL)
- Azure Synapse Analytics
- Azure Data Lake Storage (ADLS)
- Azure Logic App
- Azure Notebook
- PySpark
- Delta Tables
The data pipeline consists of the following stages:
- Bronze Layer: Raw, unprocessed data ingested directly from each table in the Azure SQL Database. These tables are stored in Parquet format in Azure Data Lake Storage (ADLS) for further processing.
- Silver Layer: Cleaned and transformed data stored as Delta Tables for optimized querying and performance.
- Gold Layer: The final, optimized dataset containing dimension and fact tables, designed for high-performance analytics and reporting.
For detailed activity descriptions, see Pipeline Activities.