A Business-to-Business (B2B) Supplier-Distributor Case Study
- Overview
- Business Objectives
- Dataset Reference
- My Approach
- Descriptive Analysis | Key Findings
- Diagnostic Analysis | Root Cause Insights
- Recommendations
- Technical Highlights
- Repository Structure
- Author's Note
This project explores supply chain operations using SQL-based exploratory data analysis (EDA) to uncover insights on supplier efficiency, logistics cost optimization, and product performance.
The dataset represents a B2B supply chain where suppliers manufacture products and distribute them to regional hubs across cities (Mumbai, Delhi, Kolkata, Bangalore, Chennai).
Each transaction includes production data, logistics routes, transportation costs, and carrier information.
Goal: Identify inefficiencies, cost drivers, and improvement opportunities across suppliers, carriers, and routes.
- Identify top-performing suppliers by revenue, sales volume, and efficiency.
- Assess manufacturing and transportation cost efficiency.
- Evaluate supplier, route, and carrier performance.
- Detect relationships between cost, lead time, and defect rate.
- Provide actionable recommendations to optimize operations.
Validated whether revenue_generated matched price × products_sold:
SELECT
product_type, SKU, price, products_sold, revenue_generated,
(price * products_sold) AS expected_revenue,
ROUND((price * products_sold) - revenue_generated, 2) AS difference
FROM supply_chain;
Finding: The revenue_generated column did not consistently match price × products_sold.
Action: Added a new clean_revenue column for accurate financial calculations.
ALTER TABLE supply_chain ADD COLUMN clean_revenue DECIMAL(10,2);
UPDATE supply_chain SET clean_revenue = price * products_sold;
Outcome: Ensured accurate revenue and profit calculations for all analyses.
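To put a number on the inconsistency before relying on clean_revenue, a check along these lines could count the mismatched rows. This is a minimal sketch: the 0.01 tolerance is an assumption chosen to absorb rounding noise, not a value taken from the project.

```sql
-- Count rows where the stored revenue differs from price * products_sold.
-- The 0.01 tolerance is an assumed threshold for rounding differences.
SELECT
    COUNT(*) AS total_rows,
    SUM(ABS(price * products_sold - revenue_generated) > 0.01) AS mismatched_rows
FROM supply_chain;
```

In MySQL a comparison evaluates to 1 or 0, so SUM over the comparison yields the number of mismatched rows.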
- Checked for missing and duplicate records.
- Verified data types.
- Dropped the erroneous column lead_times (duplicate of lead_time).
- Validated logical relationships (e.g., manufacturing_cost as unit cost × production volume).
Outcome: Dataset was clean and consistent, ready for descriptive and diagnostic analysis.
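The cleaning checks listed above can be sketched roughly as follows. This is an illustration rather than the exact queries used: it assumes SKU identifies a row uniquely, and it only tests for NULLs in columns that this README names.

```sql
-- Duplicate check: any SKU appearing more than once
-- (assumes SKU is meant to be unique per row).
SELECT SKU, COUNT(*) AS occurrences
FROM supply_chain
GROUP BY SKU
HAVING COUNT(*) > 1;

-- Missing-value check on a few key columns.
SELECT
    SUM(price IS NULL)             AS missing_price,
    SUM(products_sold IS NULL)     AS missing_products_sold,
    SUM(revenue_generated IS NULL) AS missing_revenue_generated
FROM supply_chain;

-- Inspect column data types.
DESCRIBE supply_chain;

-- Remove the duplicate lead time column.
ALTER TABLE supply_chain DROP COLUMN lead_times;
```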
The descriptive analysis focused on understanding the current performance of suppliers, routes, carriers, and regional operations.
The main findings are summarized below; screenshots of the corresponding SQL query results can accompany each one.
5.1. Supplier Revenue and Sales Contribution
Supplier 1 and Supplier 3 emerged as the leading contributors in both total sales and revenue. This indicates that these suppliers have stronger operational capacity and better distributor relationships, making them valuable long-term partners.
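A supplier ranking of this kind could come from a query like the one below. It is a sketch that assumes the supplier is stored in a column named supplier_name (a hypothetical name) and that clean_revenue has already been populated.

```sql
-- Total units sold and revenue per supplier, highest revenue first.
-- supplier_name is an assumed column name.
SELECT
    supplier_name,
    SUM(products_sold)           AS total_units_sold,
    ROUND(SUM(clean_revenue), 2) AS total_revenue
FROM supply_chain
GROUP BY supplier_name
ORDER BY total_revenue DESC;
```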
5.2. Manufacturing Cost Analysis
Supplier 4 showed the highest manufacturing cost per unit. This could indicate higher production standards or inefficiencies within its manufacturing process.
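One way to derive a per-unit figure is sketched below. It assumes that manufacturing_cost holds the total cost for the row's production run and that the volume lives in a column named production_volume; both supplier_name and production_volume are hypothetical names. If manufacturing_cost is already a per-unit value, AVG(manufacturing_cost) would be the appropriate measure instead.

```sql
-- Manufacturing cost per unit by supplier.
-- Assumes manufacturing_cost is a total per row; supplier_name and
-- production_volume are assumed column names.
SELECT
    supplier_name,
    ROUND(SUM(manufacturing_cost) / NULLIF(SUM(production_volume), 0), 4) AS mfg_cost_per_unit
FROM supply_chain
GROUP BY supplier_name
ORDER BY mfg_cost_per_unit DESC;
```

NULLIF guards against division by zero when a supplier has no recorded production volume.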
5.3. Regional Distribution Patterns
The cities of Mumbai and Kolkata received the largest number of shipments, confirming their importance as key regional hubs in the distribution network.
5.4. Transportation Mode Efficiency
Road transport was the most frequently used mode of shipment, yet it wasn't always the cheapest.
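Usage and cost can be compared side by side with a query such as the following. The column names transportation_mode and transportation_cost are assumptions made for illustration.

```sql
-- Shipment count and average transportation cost per mode.
-- transportation_mode and transportation_cost are assumed column names.
SELECT
    transportation_mode,
    COUNT(*)                           AS shipments,
    ROUND(AVG(transportation_cost), 2) AS avg_transportation_cost
FROM supply_chain
GROUP BY transportation_mode
ORDER BY shipments DESC;
```

Reading the two output columns together shows whether the most-used mode is also the cheapest one.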
5.5. Carrier Performance
Carrier B consistently achieved the lowest average shipping cost per delivery, showing stronger operational performance and cost management compared to others.
5.6. Supplier Lead Time Performance
The average lead times across suppliers reveal key operational differences:
- Supplier 1 had a balanced cycle with an average manufacturing lead time of 13 days and overall lead time of 17 days.
- Supplier 3 demonstrated the fastest end-to-end cycle, with 15 days of manufacturing lead time and 14 days of total lead time, making it the most time-efficient supplier.
- Suppliers 2, 4, and 5 showed higher manufacturing or total lead times, implying slower throughput or capacity constraints.
These insights highlight Supplier 3 as a top performer in terms of production and logistics speed.
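Averages like those above could be produced with a query of this shape. It is a sketch that assumes columns named supplier_name and manufacturing_lead_time (both hypothetical) alongside the lead_time column mentioned earlier.

```sql
-- Average manufacturing and overall lead time per supplier, in days.
-- supplier_name and manufacturing_lead_time are assumed column names.
SELECT
    supplier_name,
    ROUND(AVG(manufacturing_lead_time), 0) AS avg_mfg_lead_time_days,
    ROUND(AVG(lead_time), 0)               AS avg_lead_time_days
FROM supply_chain
GROUP BY supplier_name
ORDER BY avg_lead_time_days;
```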
5.7. Route Utilization and Cost
Route A was the most frequently used, while Route B recorded the highest average transportation cost.
5.8. Defect Rate Evaluation
Supplier 5 recorded the highest defect rates, especially in shipments bound for Chennai. This suggests potential issues in quality control or handling during transit.
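Breaking defect rates down by supplier and destination is one way to surface a pattern like this. The sketch below assumes columns named supplier_name, location, and defect_rate, none of which are confirmed by this README.

```sql
-- Average defect rate per supplier and destination city.
-- supplier_name, location and defect_rate are assumed column names.
SELECT
    supplier_name,
    location,
    COUNT(*)                   AS shipments,
    ROUND(AVG(defect_rate), 2) AS avg_defect_rate
FROM supply_chain
GROUP BY supplier_name, location
ORDER BY avg_defect_rate DESC;
```

The shipments column helps judge whether a high average rests on enough records to be meaningful.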
This section explores why those trends occur and what factors drive inefficiencies.
6.1. Manufacturing Cost vs. Defect Rate Relationship
A slight positive correlation was found between manufacturing cost and defect rate, implying that higher costs do not always guarantee better quality. Inefficient production may be inflating cost without improving outcomes.
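MySQL has no built-in correlation function, so a Pearson coefficient has to be assembled from aggregates. The sketch below computes population covariance divided by the product of the population standard deviations; defect_rate is an assumed column name.

```sql
-- Pearson correlation between manufacturing cost and defect rate:
-- r = (E[xy] - E[x]E[y]) / (sd(x) * sd(y)), using population statistics.
-- defect_rate is an assumed column name.
SELECT
    (AVG(manufacturing_cost * defect_rate)
        - AVG(manufacturing_cost) * AVG(defect_rate))
    / NULLIF(STDDEV_POP(manufacturing_cost) * STDDEV_POP(defect_rate), 0) AS pearson_r
FROM supply_chain;
```

A value slightly above 0 would match the "slight positive correlation" described here; values near 1 or -1 would indicate a strong linear relationship.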
6.2. Lead Time vs. Transportation Cost
Orders with shorter lead times were often associated with higher transportation costs, reaffirming the trade-off between delivery speed and logistics cost.
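One simple way to look at this trade-off in SQL is to group orders into lead time bands and compare average cost per band. The band boundaries and the transportation_cost column name below are illustrative assumptions.

```sql
-- Average transportation cost per lead time band.
-- Band thresholds (10 and 20 days) are arbitrary illustration values;
-- transportation_cost is an assumed column name.
SELECT
    CASE
        WHEN lead_time <= 10 THEN '0-10 days'
        WHEN lead_time <= 20 THEN '11-20 days'
        ELSE '21+ days'
    END                                AS lead_time_band,
    COUNT(*)                           AS orders,
    ROUND(AVG(transportation_cost), 2) AS avg_transportation_cost
FROM supply_chain
GROUP BY lead_time_band
ORDER BY MIN(lead_time);
```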
6.3. Route and Carrier Performance Impact
Route A and Carrier C were correlated with higher costs and longer delivery times, which could be due to distance, capacity constraints, or inconsistent service levels.
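A combined view of routes and carriers could be sketched as follows. It assumes columns named route, shipping_carrier, and transportation_cost (all hypothetical) and uses lead_time as a stand-in for delivery time.

```sql
-- Cost and lead time per route and carrier combination.
-- route, shipping_carrier and transportation_cost are assumed column names;
-- lead_time is used here as a proxy for delivery time.
SELECT
    route,
    shipping_carrier,
    COUNT(*)                           AS shipments,
    ROUND(AVG(transportation_cost), 2) AS avg_transportation_cost,
    ROUND(AVG(lead_time), 1)           AS avg_lead_time_days
FROM supply_chain
GROUP BY route, shipping_carrier
ORDER BY avg_transportation_cost DESC;
```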
6.4. Regional Cost Variance
The variance in overall cost was largely driven by logistics rather than manufacturing differences. This suggests transportation optimization should be a top priority.
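To compare how much each cost component varies between regions, one option is the coefficient of variation (standard deviation divided by mean) of the regional averages. This is a sketch of that idea, not necessarily the method used in the project; location and transportation_cost are assumed column names.

```sql
-- Spread of regional average costs, expressed as a coefficient of variation.
-- A larger value means the component differs more from region to region.
-- location and transportation_cost are assumed column names.
SELECT
    ROUND(STDDEV_POP(avg_transport) / NULLIF(AVG(avg_transport), 0), 3) AS cv_transportation,
    ROUND(STDDEV_POP(avg_mfg) / NULLIF(AVG(avg_mfg), 0), 3)             AS cv_manufacturing
FROM (
    SELECT
        location,
        AVG(transportation_cost) AS avg_transport,
        AVG(manufacturing_cost)  AS avg_mfg
    FROM supply_chain
    GROUP BY location
) AS regional;
```

Using a ratio rather than raw variance keeps the two components comparable even when they sit on different scales.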
6.5. Supplier Efficiency Comparison
Supplier 2 consistently demonstrated both the highest cost and longest lead times, making it the most critical area for performance improvement.
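Suppliers can be ranked on both dimensions at once with window functions, which require MySQL 8.0 or later. The sketch below uses manufacturing_cost as the cost measure, which is an assumption since the text does not say which cost is meant; supplier_name is also an assumed column name.

```sql
-- Rank suppliers by average cost and average lead time (1 = worst).
-- Requires MySQL 8.0+ for RANK(); supplier_name is an assumed column name
-- and manufacturing_cost is an assumed choice of cost measure.
SELECT
    supplier_name,
    ROUND(AVG(manufacturing_cost), 2)                   AS avg_manufacturing_cost,
    ROUND(AVG(lead_time), 1)                            AS avg_lead_time_days,
    RANK() OVER (ORDER BY AVG(manufacturing_cost) DESC) AS cost_rank,
    RANK() OVER (ORDER BY AVG(lead_time) DESC)          AS lead_time_rank
FROM supply_chain
GROUP BY supplier_name
ORDER BY cost_rank, lead_time_rank;
```

A supplier that ranks 1 on both columns is the weakest on cost and speed together.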
Based on the descriptive and diagnostic insights, here are the key recommendations:
- Strengthen Supplier 1 Partnership
  Supplier 1 is the top contributor to sales and revenue. Negotiate long-term agreements or volume discounts to secure supply and lower cost per unit.
- Audit Supplier 2 Operations
  With high costs and long lead times, Supplier 2's processes should be reviewed. Identify inefficiencies or supply bottlenecks for improvement.
- Optimize Route A Logistics
  Route A is the most heavily used route and is associated with higher costs and longer delivery times. Consider route redesign, consolidation of shipments, or switching carriers to lower expenses.
- Leverage Carrier B's Cost Advantage
  Carrier B demonstrated superior cost efficiency. Increasing its shipment share could lead to additional savings through volume-based pricing incentives.
- Enhance Quality Control in Chennai
  Chennai-related shipments show higher defect rates. Reinforce inspection and handling protocols to improve end-product reliability.
These recommendations directly align with the business objective of improving cost efficiency, supplier reliability, and logistics performance within the B2B supply chain network.
- SQL (MySQL): Data validation, cleaning, and analytics queries
- Data Integrity Analysis: Detecting and correcting inconsistent fields
- Descriptive & Diagnostic Analysis: Supplier, logistics, and profitability insights
- Business Communication: Translating technical findings into strategic actions
/SupplyChain_SQL_Project
│
├── supply_chain_eda.sql      → All SQL queries (cleaning + analysis)
├── insights_summary.md       → Business insights & recommendations
├── screenshots/              → Optional: Query outputs
├── dataset_reference.txt     → Kaggle dataset link
└── README.md                 → This documentation
This project was designed to simulate how a junior data analyst or supply chain analyst approaches a messy real-world dataset:
- Validating integrity,
- Cleaning inconsistencies,
- Deriving insights directly from SQL (without visualization tools),
- Translating technical work into business recommendations.










