This project applies statistical hypothesis tests to supermarket sales data in order to uncover patterns and relationships in the dataset that are useful to the business.
All tests use a significance level of 5% (α = 0.05), so a null hypothesis is rejected only when its p-value falls below 0.05. The goal is both to understand the dynamics of supermarket sales and to produce actionable findings that can inform strategic business decisions.
- 2 sample non-parametric test (Mann-Whitney U test)
- Bootstrap estimates
- Normality Test
- ANOVA
- Chi-square Independence Test
- 2 Sample Proportion Test
- Total purchasing by gender
- Rating by branch
- Association between customer type and product line
- Customer type by gender
- Python / Jupyter notebooks
- Scipy.stats
- Pingouin
- Statsmodels.stats
- Pandas
- Numpy
- Matplotlib
- Seaborn
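As a rough illustration of how two of the listed techniques work at the 5% significance level, the sketch below implements a pooled two-sample proportion z-test and a percentile bootstrap interval for a difference in means. It deliberately uses only the Python standard library so it runs on its own; it is not the notebook's code, and the function names, the resample count, and the counts in the usage line are all assumptions made up for illustration, not results from the dataset.

```python
import math
import random
from statistics import NormalDist, mean

ALPHA = 0.05  # significance level stated in the project description


def two_proportion_ztest(successes_a, n_a, successes_b, n_b):
    """Two-sided z-test for H0: p_a == p_b, using the pooled proportion.

    Hypothetical helper; assumes the pooled proportion is strictly
    between 0 and 1 so the standard error is non-zero.
    """
    p_a = successes_a / n_a
    p_b = successes_b / n_b
    pooled = (successes_a + successes_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / std_err
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


def bootstrap_mean_diff_ci(sample_a, sample_b, n_resamples=2000,
                           alpha=ALPHA, seed=0):
    """Percentile bootstrap interval for mean(sample_a) - mean(sample_b)."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    diffs = []
    for _ in range(n_resamples):
        # Resample each group with replacement, keeping its original size
        resample_a = rng.choices(sample_a, k=len(sample_a))
        resample_b = rng.choices(sample_b, k=len(sample_b))
        diffs.append(mean(resample_a) - mean(resample_b))
    diffs.sort()
    lower = diffs[int((alpha / 2) * n_resamples)]
    upper = diffs[int((1 - alpha / 2) * n_resamples) - 1]
    return lower, upper


# Made-up counts, purely illustrative (NOT taken from the dataset)
z, p = two_proportion_ztest(60, 120, 45, 110)
print(f"z = {z:.3f}, p = {p:.3f}, reject H0 at 5%: {p < ALPHA}")
```

For the bootstrap interval, a difference in means is treated as significant at the 5% level when the returned interval excludes zero. In practice the libraries listed above provide tested versions of these procedures, which are usually preferable to hand-written ones.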
This project illustrated how exploratory data analysis (EDA) and statistical hypothesis testing can address specific business requirements.
The insights gained from these analyses pave the way for further investigation and potential development of machine learning algorithms tailored to meet the specific needs of the business.
The results obtained here provide a foundation for informed decision-making and continuous improvement in line with business objectives.
Supermarkets are growing in number in the most populated cities, and market competition is high. This dataset contains the historical sales of a supermarket company, recorded at 3 different branches over 3 months.
- Invoice id: Computer generated sales slip invoice identification number
- Branch: Branch of supercenter (3 branches are available identified by A, B and C).
- City: Location of supercenters
- Customer type: Type of customer, recorded as Member for customers using a member card and Normal for customers without one.
- Gender: Gender type of customer
- Product line: General item categorization groups - Electronic accessories, Fashion accessories, Food and beverages, Health and beauty, Home and lifestyle, Sports and travel
- Unit price: Price of each product in $
- Quantity: Number of products purchased by customer
- Tax: 5% tax charged on the customer's purchase
- Total: Total price including tax
- Date: Date of purchase (Record available from January 2019 to March 2019)
- Time: Purchase time (10am to 9pm)
- Payment: Payment method used by the customer for the purchase (3 methods are available – Cash, Credit card and Ewallet)
- COGS: Cost of goods sold
- Gross margin percentage: Gross margin percentage
- Gross income: Gross income
- Rating: Customer satisfaction rating of their overall shopping experience (on a scale of 1 to 10)
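The relationship between the price columns described above can be sketched as a small consistency check. This is a minimal example under stated assumptions: it assumes COGS equals unit price times quantity (the field list does not say so explicitly), it uses the field names from the list as dictionary keys (the actual column headers in the file may differ), and `derive_totals` and `row_is_consistent` are hypothetical helper names.

```python
import math

TAX_RATE = 0.05  # 5% tax, as stated in the field list


def derive_totals(unit_price, quantity, tax_rate=TAX_RATE):
    """Return (cogs, tax, total) for one invoice line."""
    cogs = unit_price * quantity  # ASSUMPTION: COGS = unit price * quantity
    tax = cogs * tax_rate         # 5% tax on the purchase
    total = cogs + tax            # total price including tax
    return cogs, tax, total


def row_is_consistent(row, tol=0.01):
    """Check that a row's Tax and Total match the values derived from it.

    `row` is a plain dict keyed by the field names used in the list above;
    `tol` absorbs rounding to cents.
    """
    _, tax, total = derive_totals(row["Unit price"], row["Quantity"])
    return (math.isclose(row["Tax"], tax, abs_tol=tol)
            and math.isclose(row["Total"], total, abs_tol=tol))


# Hypothetical row, made up for illustration (not taken from the dataset)
example_row = {"Unit price": 10.0, "Quantity": 2, "Tax": 1.0, "Total": 21.0}
print(row_is_consistent(example_row))
```

Running a check like this over every row before testing is one way to catch rows whose derived columns do not match the inputs.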