This document offers a comprehensive overview of the process involved in creating a dimensional model for a data warehouse using the Northwind dataset, with some interpretations. The dimensional model aims to fulfill specific business requirements outlined by stakeholders and provide valuable insights into various aspects of the business.
The initial phase focused on identifying and understanding key business requirements through stakeholder meetings and discussions. The following requirements were identified:
- Churn Rate Calculation: To understand customer retention and attrition by calculating the churn rate for customers.
- Seasonal Best Sellers: Identify best-selling products during seasons with high sales rates.
- Shipper Evaluations: Evaluate shippers based on average shipping days and freight costs.
- Top Sales Employees: Determine top-performing sales employees based on sales performance.
- Returns Analysis: Explore causes of returns to predict root causes and take necessary actions.
- Customer Maturity Analysis: Analyze customer maturity based on average sales over time to ensure revenue optimization.
- Proactive Customer Engagement: Enhance customer experience and loyalty by initiating contact with customers before they initiate orders.
The dimensional model was designed to effectively address the identified business requirements. It consists of the following dimensions and fact tables:
- FactOrders: Contains transactional data related to orders.
- FactReturns: Stores information about returned orders.
- DimLocation: Contains information about geographical locations.
- DimCustomer: Stores details about customers.
- DimSupplier: Holds data regarding suppliers.
- DimEmployee: Contains information about employees.
- DimProduct: Stores details about products.
- DimShipper: Holds data related to shipping companies.
- DimDate: Contains date-related information.
After creating the dimensional model for the Northwind data warehouse, the next step involves optimizing query performance by creating indexes on columns that are frequently used in queries. The following columns have been identified as crucial for efficient query processing based on the business requirements:
- RequiredDate
- TotalSales
- CategoryName
- Year
- ReturnCause
- Maturity
- Foreign Key Indexing
More details about the indexing strategy and dimensional modeling included in the README file in
DWH Modeling
folder
- Business Insights:
- This folder includes the queries performed to meet the business requirements from the stakeholders. The README.md file presents visualizations of these queries.
- DWH Implementation:
- This folder contains the SQL scripts for creating the data warehouse and the indexes.
- DWH Modeling:
- This folder includes the dimensional modeling of the warehouse.
- Data Ingestion:
- This folder includes the SQL scripts for ingesting data into the created tables.
Postgres
for SQL implementationPower BI
for visualization of data