In this project, we performed a comprehensive analysis of Walmart’s sales data using PostgreSQL as the primary database management system. The goal was to explore key performance indicators such as branch performance, product trends, customer behavior, and revenue patterns. This analysis serves as a foundation for making data-driven decisions to optimize operations, enhance customer satisfaction, and improve profitability.
- Database Platform: PostgreSQL was used for executing all SQL queries, ensuring robust data management and high-performance analytics.
- PostgreSQL Extensions: Advanced SQL functions, such as
CASE
,TO_CHAR
, andROUND
, were used to manipulate and analyze the data effectively. - File Outputs: Results were exported into
.csv
files for reporting and further analysis.
The dataset consists of sales transactions recorded across Walmart’s branches, with fields capturing details such as product categories, customer demographics, sales amounts, tax rates, and customer ratings. Below are the key attributes of the dataset:
Column Name | Description |
---|---|
invoice_id |
Unique identifier for each transaction |
branch_code |
Branch identifier (A, B, C) |
city_name |
Name of the city where the branch is located |
customer_category |
Customer type (e.g., Member, Normal) |
gender |
Gender of the customer |
product_category |
Category of the product purchased |
unit_cost |
Cost per unit of the product |
quantity_sold |
Number of units sold in the transaction |
tax_rate |
Tax percentage applied to the transaction |
total_sales |
Total amount of the transaction including tax |
transaction_date |
Date of the transaction |
transaction_time |
Time of the transaction |
payment_method |
Payment method used |
cost_of_goods |
Cost of goods sold |
gross_margin |
Margin percentage |
gross_profit |
Profit earned from the transaction |
customer_rating |
Customer satisfaction rating (1-10 scale) |
The table sales_data
was designed with a structured schema, ensuring efficient storage and analysis of sales data. Additional calculated columns, such as time_period
, day_of_week
, and month_name
, were added to facilitate time-based analysis.
- A PostgreSQL database was created, and the
sales_data
table was initialized with the specified schema.
- Three derived columns were added:
time_period
: Categorized transaction times into Morning, Afternoon, and Evening.day_of_week
: Extracted day names from the transaction date.month_name
: Extracted month names from the transaction date.
Queries were executed to explore:
- Branch and city performance
- Product trends
- Customer demographics
- Time-based patterns
- Seasonal revenue insights
The results of each query were exported into .csv
files for reporting and visualization.
-
Branch and City Performance:
- Revenue comparison across branches and cities.
- Analysis of tax rates in each city.
-
Product Analysis:
- Best-selling product categories by quantity.
- Revenue generated by product categories.
- Gender preferences for product categories.
-
Customer Behavior:
- Distribution of customer ratings.
- Best days and times for customer ratings.
- Gender distribution across branches.
-
Revenue Insights:
- Revenue trends by month and quarter.
- Contributions of customer types to overall revenue.
- VAT analysis across customer types.
-
All Branches:
- Walmart operates across three cities: Yangon, Mandalay, and Naypyitaw, with branch codes A, B, and C respectively.
-
Branch Revenue:
- The analysis shows the revenue performance of each branch:
- Branch C (Naypyitaw): $110,568.71
- Branch A (Yangon): $106,200.37
- Branch B (Mandalay): $106,197.67
- Insights:
- While all branches perform comparably, Branch C leads slightly, suggesting effective operations or higher customer demand in Naypyitaw.
- The analysis shows the revenue performance of each branch:
-
City Tax Rates:
- Average tax rates for each city were calculated:
- Naypyitaw: 16.05%
- Mandalay: 15.23%
- Yangon: 14.87%
- Insights:
- Naypyitaw has the highest tax rate, which could impact pricing strategies or customer behavior.
- Average tax rates for each city were calculated:
-
Best-Selling Product Categories by Quantity:
- The most frequently purchased product categories:
- Electronic Accessories: 971 units
- Food and Beverages: 952 units
- Sports and Travel: 920 units
- Home and Lifestyle: 911 units
- Insights:
- Electronic Accessories and Food and Beverages are the most demanded categories, making them key focus areas for stocking and promotions.
- The most frequently purchased product categories:
-
Top Product Categories by Revenue:
- The highest revenue-generating product categories:
- Food and Beverages: $56,144.84
- Sports and Travel: $55,122.83
- Electronic Accessories: $54,337.53
- Insights:
- While Electronic Accessories lead in sales quantity, Food and Beverages bring in the highest revenue due to higher per-unit pricing.
- The highest revenue-generating product categories:
-
Popular Product Categories by Gender:
- The top product preferences by gender:
- Female Preferences:
- Fashion Accessories (96 purchases)
- Food and Beverages (90 purchases)
- Male Preferences:
- Health and Beauty (88 purchases)
- Electronic Accessories (86 purchases)
- Female Preferences:
- Insights:
- Females show higher interest in fashion, while males lean towards health-related and electronic products.
- The top product preferences by gender:
-
Customer Ratings Distribution:
- Ratings were categorized into bins, showing overall satisfaction:
- 6-8: 351 ratings (highest)
- 8-10: 308 ratings
- 4-6: 330 ratings
- 2-4: 11 ratings
- Insights:
- A significant portion of customers rated their experience 6 or above, reflecting positive overall satisfaction.
- Ratings were categorized into bins, showing overall satisfaction:
-
Average Ratings by Time of Day:
- Average customer ratings during different times:
- Afternoon: 7.03
- Morning: 6.96
- Evening: 6.93
- Insights:
- Ratings are slightly higher in the afternoon, which might be due to less crowded stores or better service.
- Average customer ratings during different times:
-
Best Day for Ratings:
- Average customer ratings by day of the week:
- Monday: 7.15
- Friday: 7.08
- Sunday: 7.01
- Insights:
- Monday is the highest-rated day, indicating effective operations or positive customer experiences.
- Average customer ratings by day of the week:
-
Gender Distribution Across Branches:
- Male and female customer distribution by branch:
- Branch A: 179 males, 161 females
- Branch B: 170 males, 162 females
- Branch C: 150 males, 178 females
- Insights:
- Branch C sees more female customers, highlighting a potential focus on female-oriented products.
- Male and female customer distribution by branch:
-
Monthly Revenue:
- Revenue trends by month:
- January: $116,291.87
- March: $109,455.51
- February: $97,219.37
- Insights:
- January leads in revenue, possibly due to holiday or New Year shopping.
- Revenue trends by month:
-
Quarterly Trends:
- Total revenue for Q1: $322,966.75
- Insights:
- Q1 shows strong sales, indicating seasonal demand patterns.
-
Revenue by Customer Type:
- Contribution to revenue by customer category:
- Members: $164,223.44
- Normal Customers: $158,743.31
- Insights:
- Members contribute slightly more revenue, highlighting the effectiveness of loyalty programs.
- Contribution to revenue by customer category:
-
Time-Based Revenue:
- Revenue trends by time of day:
- Evening: $138,370.92
- Afternoon: $122,797.02
- Morning: $61,798.81
- Insights:
- The evening period generates the highest revenue, likely due to post-work shopping.
- Revenue trends by time of day:
-
VAT Contributions Across Customer Types:
- Average tax rates paid:
- Members: 15.61%
- Normal Customers: 15.15%
- Insights:
- Members pay a slightly higher average tax rate, reflecting higher-value purchases.
- Average tax rates paid:
-
Branch Optimization:
- Branch C outperforms slightly; replicating its strategies in other branches could boost performance.
-
Product Focus:
- Electronic Accessories and Food and Beverages require priority for inventory and promotion planning.
- Tailored gender-specific promotions can drive further sales.
-
Customer Behavior:
- Focus on improving customer experience during evening hours and on low-rated days like Wednesday.
-
Seasonal Planning:
- Prepare for peak Q1 demand, especially in January, by ensuring adequate inventory.
-
Tax Strategy:
- Review pricing strategies in Naypyitaw to counteract its high tax rates and maintain competitiveness.
This project highlights the power of SQL and PostgreSQL in uncovering actionable insights from sales data. The results provide Walmart with a roadmap for optimizing operations, targeting customer segments, and enhancing profitability through data-driven strategies.