Skip to content

This project leveraged PostgreSQL to analyze Walmart sales data, uncovering key insights into branch performance, product trends, and customer behavior. Data was imported, analyzed through structured queries, and results were exported to support actionable business strategies.

Notifications You must be signed in to change notification settings

Sanju-srivatsa/Walmart-Sales-Analysis-Project-PostgreSQL

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

5 Commits
 
 
 
 
 
 
 
 

Repository files navigation

Walmart-Sales-Analysis-Project-PostgreSQL


1. Introduction

In this project, we performed a comprehensive analysis of Walmart’s sales data using PostgreSQL as the primary database management system. The goal was to explore key performance indicators such as branch performance, product trends, customer behavior, and revenue patterns. This analysis serves as a foundation for making data-driven decisions to optimize operations, enhance customer satisfaction, and improve profitability.


2. Tools and Technologies

  1. Database Platform: PostgreSQL was used for executing all SQL queries, ensuring robust data management and high-performance analytics.
  2. PostgreSQL Extensions: Advanced SQL functions, such as CASE, TO_CHAR, and ROUND, were used to manipulate and analyze the data effectively.
  3. File Outputs: Results were exported into .csv files for reporting and further analysis.

3. Dataset Description

The dataset consists of sales transactions recorded across Walmart’s branches, with fields capturing details such as product categories, customer demographics, sales amounts, tax rates, and customer ratings. Below are the key attributes of the dataset:

Column Name Description
invoice_id Unique identifier for each transaction
branch_code Branch identifier (A, B, C)
city_name Name of the city where the branch is located
customer_category Customer type (e.g., Member, Normal)
gender Gender of the customer
product_category Category of the product purchased
unit_cost Cost per unit of the product
quantity_sold Number of units sold in the transaction
tax_rate Tax percentage applied to the transaction
total_sales Total amount of the transaction including tax
transaction_date Date of the transaction
transaction_time Time of the transaction
payment_method Payment method used
cost_of_goods Cost of goods sold
gross_margin Margin percentage
gross_profit Profit earned from the transaction
customer_rating Customer satisfaction rating (1-10 scale)

4. Table Structure

The table sales_data was designed with a structured schema, ensuring efficient storage and analysis of sales data. Additional calculated columns, such as time_period, day_of_week, and month_name, were added to facilitate time-based analysis.


5. Project Steps

5.1 Database Setup

  • A PostgreSQL database was created, and the sales_data table was initialized with the specified schema.

5.2 Data Preprocessing

  • Three derived columns were added:
    1. time_period: Categorized transaction times into Morning, Afternoon, and Evening.
    2. day_of_week: Extracted day names from the transaction date.
    3. month_name: Extracted month names from the transaction date.

5.3 Exploratory Data Analysis (EDA)

Queries were executed to explore:

  • Branch and city performance
  • Product trends
  • Customer demographics
  • Time-based patterns
  • Seasonal revenue insights

5.4 Query Outputs

The results of each query were exported into .csv files for reporting and visualization.


6. Key Areas Covered

  1. Branch and City Performance:

    • Revenue comparison across branches and cities.
    • Analysis of tax rates in each city.
  2. Product Analysis:

    • Best-selling product categories by quantity.
    • Revenue generated by product categories.
    • Gender preferences for product categories.
  3. Customer Behavior:

    • Distribution of customer ratings.
    • Best days and times for customer ratings.
    • Gender distribution across branches.
  4. Revenue Insights:

    • Revenue trends by month and quarter.
    • Contributions of customer types to overall revenue.
    • VAT analysis across customer types.

7. Results and Insights

Branch and City Performance

  1. All Branches:

    • Walmart operates across three cities: Yangon, Mandalay, and Naypyitaw, with branch codes A, B, and C respectively.
  2. Branch Revenue:

    • The analysis shows the revenue performance of each branch:
      • Branch C (Naypyitaw): $110,568.71
      • Branch A (Yangon): $106,200.37
      • Branch B (Mandalay): $106,197.67
    • Insights:
      • While all branches perform comparably, Branch C leads slightly, suggesting effective operations or higher customer demand in Naypyitaw.
  3. City Tax Rates:

    • Average tax rates for each city were calculated:
      • Naypyitaw: 16.05%
      • Mandalay: 15.23%
      • Yangon: 14.87%
    • Insights:
      • Naypyitaw has the highest tax rate, which could impact pricing strategies or customer behavior.

Product Analysis

  1. Best-Selling Product Categories by Quantity:

    • The most frequently purchased product categories:
      • Electronic Accessories: 971 units
      • Food and Beverages: 952 units
      • Sports and Travel: 920 units
      • Home and Lifestyle: 911 units
    • Insights:
      • Electronic Accessories and Food and Beverages are the most demanded categories, making them key focus areas for stocking and promotions.
  2. Top Product Categories by Revenue:

    • The highest revenue-generating product categories:
      • Food and Beverages: $56,144.84
      • Sports and Travel: $55,122.83
      • Electronic Accessories: $54,337.53
    • Insights:
      • While Electronic Accessories lead in sales quantity, Food and Beverages bring in the highest revenue due to higher per-unit pricing.
  3. Popular Product Categories by Gender:

    • The top product preferences by gender:
      • Female Preferences:
        • Fashion Accessories (96 purchases)
        • Food and Beverages (90 purchases)
      • Male Preferences:
        • Health and Beauty (88 purchases)
        • Electronic Accessories (86 purchases)
    • Insights:
      • Females show higher interest in fashion, while males lean towards health-related and electronic products.

Customer Behavior

  1. Customer Ratings Distribution:

    • Ratings were categorized into bins, showing overall satisfaction:
      • 6-8: 351 ratings (highest)
      • 8-10: 308 ratings
      • 4-6: 330 ratings
      • 2-4: 11 ratings
    • Insights:
      • A significant portion of customers rated their experience 6 or above, reflecting positive overall satisfaction.
  2. Average Ratings by Time of Day:

    • Average customer ratings during different times:
      • Afternoon: 7.03
      • Morning: 6.96
      • Evening: 6.93
    • Insights:
      • Ratings are slightly higher in the afternoon, which might be due to less crowded stores or better service.
  3. Best Day for Ratings:

    • Average customer ratings by day of the week:
      • Monday: 7.15
      • Friday: 7.08
      • Sunday: 7.01
    • Insights:
      • Monday is the highest-rated day, indicating effective operations or positive customer experiences.
  4. Gender Distribution Across Branches:

    • Male and female customer distribution by branch:
      • Branch A: 179 males, 161 females
      • Branch B: 170 males, 162 females
      • Branch C: 150 males, 178 females
    • Insights:
      • Branch C sees more female customers, highlighting a potential focus on female-oriented products.

Revenue Insights

  1. Monthly Revenue:

    • Revenue trends by month:
      • January: $116,291.87
      • March: $109,455.51
      • February: $97,219.37
    • Insights:
      • January leads in revenue, possibly due to holiday or New Year shopping.
  2. Quarterly Trends:

    • Total revenue for Q1: $322,966.75
    • Insights:
      • Q1 shows strong sales, indicating seasonal demand patterns.
  3. Revenue by Customer Type:

    • Contribution to revenue by customer category:
      • Members: $164,223.44
      • Normal Customers: $158,743.31
    • Insights:
      • Members contribute slightly more revenue, highlighting the effectiveness of loyalty programs.
  4. Time-Based Revenue:

    • Revenue trends by time of day:
      • Evening: $138,370.92
      • Afternoon: $122,797.02
      • Morning: $61,798.81
    • Insights:
      • The evening period generates the highest revenue, likely due to post-work shopping.
  5. VAT Contributions Across Customer Types:

    • Average tax rates paid:
      • Members: 15.61%
      • Normal Customers: 15.15%
    • Insights:
      • Members pay a slightly higher average tax rate, reflecting higher-value purchases.

8. Key Takeaways

  1. Branch Optimization:

    • Branch C outperforms slightly; replicating its strategies in other branches could boost performance.
  2. Product Focus:

    • Electronic Accessories and Food and Beverages require priority for inventory and promotion planning.
    • Tailored gender-specific promotions can drive further sales.
  3. Customer Behavior:

    • Focus on improving customer experience during evening hours and on low-rated days like Wednesday.
  4. Seasonal Planning:

    • Prepare for peak Q1 demand, especially in January, by ensuring adequate inventory.
  5. Tax Strategy:

    • Review pricing strategies in Naypyitaw to counteract its high tax rates and maintain competitiveness.

9. Conclusion

This project highlights the power of SQL and PostgreSQL in uncovering actionable insights from sales data. The results provide Walmart with a roadmap for optimizing operations, targeting customer segments, and enhancing profitability through data-driven strategies.

About

This project leveraged PostgreSQL to analyze Walmart sales data, uncovering key insights into branch performance, product trends, and customer behavior. Data was imported, analyzed through structured queries, and results were exported to support actionable business strategies.

Topics

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published