The OctopuFS library helps manage cloud storage, specifically ADLS Gen2. It lets you operate on files (moving, copying, setting ACLs) efficiently. It is designed to work on Databricks, but should work on any other platform as well.
Explore the Tokyo Olympics data journey! We ingested a GitHub CSV into Azure via Data Factory, stored it in Data Lake Storage Gen2, performed transformations in Databricks, conducted advanced analytics in Azure Synapse, and visualized insights in Synapse or Power BI.
This sample demonstrates how to create a Linux Virtual Machine in a virtual network that privately accesses a blob storage account using an Azure Private Endpoint.
Data engineering project on supply chain ETL. It creates a dynamic ADF pipeline to ingest both full-load and incremental-load data from SQL Server, then transforms these datasets following the medallion architecture using Databricks.
COVID19-ADF is a project that leverages Azure services to collect, analyze, and visualize COVID-19 data. With seamless data integration and advanced analytics, it provides valuable insights into the pandemic's impact, enabling informed decision-making in the fight against COVID-19.
This repo contains the code for a SQL-driven Spark aggregation framework that runs on a Databricks cluster and integrates with an Azure storage account.
This project builds a cloud-based pipeline to extract NYC taxi data from an API and store it in Azure Data Lake Storage (ADLS). Databricks and PySpark are used to transform the data through the medallion architecture (Bronze → Silver → Gold). Delta Lake ensures reliable storage, and Power BI provides visual insights for data-driven decision-making.
"Explore Formula 1 data analytics with this project. Leveraging the Ergast API, it utilizes Databricks Spark for ingestion, transformation, and analysis. ADLS acts as the storage layer, while Power BI visualizes the ADLS presentation layer. Uncover insights in the world of Formula 1 through powerful data analytics."